(57) Abstract

The manipulation system and apparatus receive electronic
signals which are to be processed as enhanced stereophonic
audio signals from two laterally spaced loudspeakers in front
of the listener, either directly before recording or
broadcasting, or after recording, or after being broadcast.
The system and apparatus process those signals to produce a
conditioning signal, such as would be produced by virtual room
boundaries, which is heard together with the original signals
so that an enlarged listening area is perceived by the
listener. By amplitude and phase control of the signal to the
two loudspeakers, the system and apparatus provide a means for
control over the enhanced sound field. This enhanced sound
field is perceived by the listener as being contained within
boundaries larger than those normally reproduced by
stereophonic speakers. The system and apparatus generate a
conditioning signal for the enhancement of natural, and
generation of artificial, spatial qualities present in stereo
signals usually masked in the acoustic environment in which
reproduction takes place, through generation of phantom
boundaries. The apparatus can monitor its own output and shut
down or reduce the effects if the output contains qualities
that cannot be broadcast. The apparatus provides
self-adjustment in the electronic system to maintain spatial
masking reversal at a constant value regardless of program
material.

SOUND IMAGE MANIPULATION
APPARATUS AND METHOD FOR SOUND IMAGE ENHANCEMENT 

BACKGROUND OF THE INVENTION 

This invention is directed to an automatic sound
image enhancement method and apparatus wherein the electronic
signal which corresponds to the audio signal is electronically
treated by amplitude and phase control to produce a perception
of enhancements to music and sounds. The invention preferably
operates on stereophonically recorded music (post production
enhancement) or in recording and mixing stereophonic
recordings (production). The invention may also be used in
connection with enhancing monophonic or monaural sound sources
to synthesize a stereo-like effect or to locate such sources
to positions beyond those normally found in the stereo sound
stage.

Sound is vibration in an elastic medium, and
acoustic energy is the additional energy in the medium
produced by the sound. Sound in the medium is propagated by
compression and rarefaction of the energy in the medium. The
medium oscillates, but the sound travels. A single cycle is
a complete single excursion of the medium, and the frequency
is the number of cycles per unit time. Wavelength is the
distance between wave peaks, and the amplitude of motion
(related to energy) is the oscillatory displacement. In
fluids, the unobstructed wave front spherically expands.

Hearing is the principal response of a human subject
to sound. The ear, its mechanism and nerves receive and
transmit the hearing impulse to the brain which receives it,
compares it to memory, analyzes it, and translates the impulse
into a concept which evokes a mental response. The final step
in the process is called listening and takes place in the
brain; the ear is only a receiver. Thus, hearing is objective
and listening is subjective. Since the method and apparatus
of this invention are for the automatic stereophonic image
enhancement for human listening, the listening process is in
perceptions of hearing. This patent describes the perceptions
of human subjects. Because a subject has two ears, laterally
spaced from each other, the sound at each eardrum is nearly
always different. Some of the differences are due to the
level, amplitude or energy, while others are due to timing or
phase differences. Each ear sends a different signal to the
brain, and the brain analyzes and compares both of the signals
and extracts information from them, including information in
determining the apparent position and size of the source, and
acoustic space surrounding the listener.

The first sound heard from a source is the direct
sound which comes by line-of-sight from the source. The direct
sound arrives unchanged and uncluttered, and lasts only as
long as the source emits it. The direct sound is received at
the ear with a frequency response (tonal quality) which is
relatively true to the sound produced by the source because
it is subject only to losses in the fluid medium (air). The
important transient characteristics such as timbre, especially
in the higher registers, are conveyed by direct sound. The
integral differences at each eardrum are found in time,
amplitude and spectral differences. The physical spacing of
the ears causes one ear to hear after the other, except for
sound originating from a source on the median plane between
the ears. The time delayed difference is a function of the
direction from which the sound arrives, and the delay is up
to about 0.8 millisecond. The 0.8 millisecond time delay is
about equal to the period of 1 cycle at 1,110 Hz. Above this
frequency, the acoustic wavelength of arriving sounds becomes
smaller than the ear-to-ear spacing, and the interaural time
difference decreases in significance so that it is useful only
below about 1,400 Hz to locate the direction of the sound. The
difference in amplitude between the sound arriving at the two
ears results principally from the diffracting and shadowing
effect of the head and external ear pinna. These effects are
greater above 400 Hz and become the source of information the
brain interprets to determine the direction of the source for
higher frequencies. Other clues to elevation and direction of
the sound derive from our practice of turning our head during
the sound direction evaluation process. This changes the
relative amplitude and time difference to provide further data
for mental processing to evaluate direction. Both processes
are frequency dependent, but it has been shown that the time
difference is more useful with transient portions of sound
while both are used for evaluation of the source direction of
continuous signals.
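
The relation between the interaural delay and the frequency
limits quoted above can be checked with a little arithmetic.
The sketch below is illustrative only: the function names and
the 343 m/s speed of sound are assumptions that do not appear
in the text. A period of 0.8 millisecond corresponds to
1,250 Hz, which lies between the 1,110 Hz and 1,400 Hz figures
given above.

```python
# Illustrative arithmetic only; names and constants are assumptions.
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees C


def period_to_frequency_hz(period_s: float) -> float:
    """Frequency whose single cycle lasts period_s seconds."""
    return 1.0 / period_s


def wavelength_m(frequency_hz: float,
                 speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Acoustic wavelength in metres at the given frequency."""
    return speed_m_s / frequency_hz


max_interaural_delay_s = 0.8e-3  # the 0.8 millisecond figure from the text
matching_frequency_hz = period_to_frequency_hz(max_interaural_delay_s)
matching_wavelength_m = wavelength_m(matching_frequency_hz)

if __name__ == "__main__":
    print(f"{matching_frequency_hz:.0f} Hz, {matching_wavelength_m:.3f} m")
```

Under the assumed speed of sound, the wavelength at that
frequency is a little over a quarter of a metre.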

In human listening, memory plays an important role
in the evaluation of sound. The brain compares the interaural
temporal difference, interaural amplitude difference,
interaural spectral difference, as well as the precedence
effect, and temporal fusion, to be described later, with
memories of the same factors. The brain is constantly
comparing present perceptions with stored impressions so that
those signals which are currently being received are compared
with memory to provide a conception of the surrounding
activity. In listening, the combination of the sound as
perceived and the memory of similar events, together, produce
a mental image of an aural conceptual geometrical framework
around us associated with the sources of sound to become thus
a conceptual image space. In the conceptual image space, what
is real and what seems to be real are the same. The present
system and apparatus is directed toward generating a
conceptual image space which seems to be real but, from an
objective evaluation, is an illusion.

In an apparatus where there are two spaced
loudspeaker sound sources in front of the observer, with the
observer centered between them, the production of
substantially the same sound from each speaker, in-phase and
of the same amplitude, will present to the observer a virtual
sound image midway between the two speakers. Since the sound
source is in-phase, this virtual sound image will be called
a "homophasic image". By changing the relative amplitude, the
homophasic image can be moved to any point between the two
speakers. In conventional professional processing of sound
signals, this moving action is called "panning" and is
controlled by a pan pot (panoramic potentiometer).
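
The panning action described above can be sketched in a few
lines. The constant-power (sine/cosine) pan law used here is a
common studio convention chosen for illustration; the text
does not say which law a given pan pot follows, and the
function name pan is hypothetical.

```python
import math


def pan(sample: float, position: float) -> tuple:
    """Split one in-phase sample between left and right speakers.

    position runs from 0.0 (fully left) to 1.0 (fully right).
    The sine/cosine law is an assumption, not taken from the text.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0.0 and 1.0")
    angle = position * math.pi / 2.0
    return sample * math.cos(angle), sample * math.sin(angle)


if __name__ == "__main__":
    for position in (0.0, 0.25, 0.5, 0.75, 1.0):
        left, right = pan(1.0, position)
        print(f"{position:.2f}: left={left:.3f} right={right:.3f}")
```

With position at 0.5 both speakers receive the same amplitude,
which is the centered homophasic image; moving position toward
either end shifts the image toward that speaker.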

An equally convincing virtual sound image can be
heard if the polarity is reversed on one of the signals sent
to one of the same two loudspeakers. This results in a 180
degree phase shift for the sound from that speaker reaching
the ears. For simplification, the first 0 degree retarded
phase-shifted signal from the left speaker first reaches the
left ear and later the right ear; simultaneously the second
180 degree retarded phase-shifted signal from the right
speaker first reaches the right ear and later the left ear,
providing information to the ear-brain mechanism which
manifests a virtual sound image to the rear of the center
point of the listener's head. This virtual image is the
"antiphasic" image. Since it is a virtual image created by
mental process, the position is different for different
listeners. Most listeners hear the antiphasic image as
external and to the rear of the skull. The antiphasic image
does not manifest itself as a point source, but is diffused
and forms the rear boundary of the listener's conceptual image
space. By changing the phase relationship and/or amplitude of
various frequencies of the left and right signals, virtual
images can be generated along an arc or semicircle from the
back of the observer's head toward the left or right speakers.
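
The electrical difference between the homophasic and
antiphasic cases can be illustrated with sum and difference
signals. The sketch below is a minimal illustration only; the
helper name and the 16-sample test tone are assumptions. An
in-phase pair has no difference signal, and a polarity-reversed
pair has no sum signal.

```python
import math


def sum_and_difference(left, right):
    """Return (left + right, left - right), sample by sample."""
    total = [l + r for l, r in zip(left, right)]
    diff = [l - r for l, r in zip(left, right)]
    return total, diff


# One cycle of a sine tone, used here purely as a test signal.
tone = [math.sin(2.0 * math.pi * n / 16) for n in range(16)]

# Homophasic case: the same signal, same polarity, to both speakers.
homophasic_sum, homophasic_diff = sum_and_difference(tone, tone)

# Antiphasic case: polarity reversed on the right-hand signal.
inverted = [-x for x in tone]
antiphasic_sum, antiphasic_diff = sum_and_difference(tone, inverted)
```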

Another factor which influences the perception of
sound is the "precedence effect" wherein the first sound to
be heard takes command of the ear-brain mechanism, and sound
arriving up to 50 milliseconds later seems to arrive as part
of and from the same direction as the original sound. By
delaying the signal sent to one speaker, as compared to the
other, the apparent direction of the source can be changed.
As part of the precedence effect, the apparent source
direction is operative through signal delay for up to 30
milliseconds. The effect is dependent upon the transient
characteristics of the signal.
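
Delaying the signal to one speaker, as described above, is
simple to express for sampled audio. This is a minimal sketch
under stated assumptions: the 48 kHz sample rate, the choice
of the right channel, and both function names are
illustrative and are not taken from the text.

```python
def delay_samples(delay_ms: float, sample_rate_hz: int) -> int:
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate_hz / 1000.0)


def delay_right_channel(mono, delay_ms: float, sample_rate_hz: int = 48000):
    """Feed one signal to both speakers, delaying the right-hand copy.

    Both returned channels are zero-padded to the same length.
    """
    n = delay_samples(delay_ms, sample_rate_hz)
    left = list(mono) + [0.0] * n
    right = [0.0] * n + list(mono)
    return left, right


if __name__ == "__main__":
    # A 10 ms delay lies inside the 30 ms window mentioned in the text.
    print(delay_samples(10.0, 48000), "samples")
```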

An intrinsic part of the precedence effect, yet an
identifiably separate phenomenon, is known as "temporal
fusion" which fuses together the direct and delayed sounds.
The ear-brain mechanism blends together two or more very
similar sounds arriving at nearly the same time. After the
first sound is heard, the brain suppresses similar sounds
arriving within about the next 30 milliseconds. It is this
phenomenon which keeps the direct sound and room reverberation
all together as one pleasing and natural perception of live
listening. Since the directional hearing mechanism works on
the direct sound, the source of that sound can be localized
even though it is closely followed by multiple waves coming
from different directions.

The walls of the room are reflection surfaces from
which the direct sound reflects to form complex reflections.
The first reflection to reach the listener is known as a first
order reflection; the second, as second order, etc. An
acoustic image is formed which can be considered as coming
from a virtual source situated on the continuation of a line
linking the listener with the point of reflection. This is
true of all reflection orders. If we generate signals which
produce virtual images, boundaries are perceived by the
listener. This is a phenomenon of conditioned memory. The
position of the boundary image can be expanded by amplitude
and phase changes within the signal generating the virtual
images. The apparent boundary images broaden the perceived
space.

Audio information affecting the capability of the
ear-brain mechanism to judge location, size, range, scale,
reverberation, spatial identity, spatial impression and
ambiance can be extracted from the difference between the left
and right source. Modification of this information through
frequency shaping and linear delay is necessary to produce the
perception of phantom image boundaries when this information
is mixed back with the original stereo signal at the
antiphasic image position.
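
The steps named above (take the left-minus-right difference,
shape and delay it, and mix it back at the antiphasic
position) can be sketched as follows. This is an illustration
under stated assumptions, not the circuit of any embodiment:
mixing "at the antiphasic image position" is taken here to
mean adding the signal to one channel and subtracting it from
the other, the frequency shaping is left as an optional
caller-supplied function, and all names are hypothetical.

```python
def conditioning_signal(left, right, delay_samples, shape=None):
    """Difference signal, optionally shaped, then delayed.

    shape is an optional function mapping a list of samples to a
    list of the same length; it stands in for frequency shaping.
    """
    diff = [l - r for l, r in zip(left, right)]
    if shape is not None:
        diff = shape(diff)
    delayed = [0.0] * delay_samples + diff
    return delayed[:len(diff)]  # keep the original length


def enhance(left, right, gain, delay_samples, shape=None):
    """Add the conditioning signal to left and subtract it from right."""
    cond = conditioning_signal(left, right, delay_samples, shape)
    out_left = [l + gain * c for l, c in zip(left, cond)]
    out_right = [r - gain * c for r, c in zip(right, cond)]
    return out_left, out_right


left_in = [1.0, 0.5, 0.25, 0.0]
right_in = [0.0, 0.5, -0.25, 1.0]
left_out, right_out = enhance(left_in, right_in, gain=0.5, delay_samples=1)
```

Because the same signal is added on one side and subtracted on
the other, it cancels when the two outputs are summed, so the
left-plus-right signal is unchanged by the processing.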



SUMMARY OF THE INVENTION 

The common practice of the recording industry, for
producing a stereo signal, is to use two or more microphones
near the sound source. These microphones, no matter how many
are used, are always electrically polarized in-phase. When the
program source is produced under these conditions (which are
industry standard), the apparatus described herein generates
a "synthetic" conditioning signal for establishment of a third
point with its own time domain. This derivation is called
synthetic because there is a separation, alternation and
regrouping to form the new whole.

To further help establish a point with a separate
time domain, a third microphone may be used to define the
location of the third point in relation to the stereo pair.
Contrary to the normal procedure of adding the output of a
third microphone to the left and right side of the stereo
microphone pair, the third microphone is added to the left
stereo pair and subtracted from the right stereo pair. This
arrangement provides a two-channel stereo signal which is
composed of a left signal, a right signal, and a recoverable
signal which has its source at a related but separate position
in the acoustic space being recorded. This is called organic
derivation and it compares to the synthetic situation
discussed above, where the ratios are proportional to the left
minus the right (from which it was derived) but is based on
its own time reference, which is, as will be seen, related to
the spacing between the three microphones. The timing between
the organic conditioning signal is contingent upon the
position of the original sound source with respect to the
three microphones. The information derived more closely
approximates the natural model than that of the synthetically
derived conditioning signal.

Control over either the organic or synthetic 
situations, the processing thereof, and the generation of a 
conditioning signal therefrom will produce an expanded 
listening experience.

All sources of sound recorded with two or more
microphones in synthetic or organic situations contain the
original directional cues. When acted upon by the apparatus
of this invention, a portion of the original directional cues
are isolated, modified, reconstituted and added, in the form
of a conditioning signal, to the original forming a new whole.
The new whole is in part original and in part synthetic. The
control of the original-to-synthetic ratio is under the
direction of the operator via two operating modes:

(1) Space, in which the ratio is constant.
Synthetic is directly proportional to the original
and, therefore, enhancement depends upon the amount
of original information present in the stereo
program material.

(2) Auto Space, in which the ratio is
electrically varied. Synthetic is inversely
proportional to the original and, therefore, the
enhancement is held at a constant average
regardless of program material.
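
The contrast between the two modes can be sketched as two gain
rules applied to the difference (directional) signal. This is
a hedged illustration, not the control law of the apparatus:
the block-based RMS measurement, the target level, the gain
ceiling and every function name are assumptions introduced
here.

```python
import math


def rms(block):
    """Root-mean-square level of a block of samples."""
    if not block:
        return 0.0
    return math.sqrt(sum(x * x for x in block) / len(block))


def space_gain(ratio: float) -> float:
    """Space mode: a fixed ratio, so the enhancement follows the program."""
    return ratio


def auto_space_gain(diff_block, target_rms: float,
                    max_gain: float = 4.0) -> float:
    """Auto Space mode: gain varies inversely with the difference level.

    The gain is chosen so that gain * rms(diff_block) equals target_rms,
    limited to max_gain so that near-silence is not boosted without bound.
    """
    level = rms(diff_block)
    if level == 0.0:
        return max_gain
    return min(max_gain, target_rms / level)


quiet_diff = [0.1, -0.1, 0.1, -0.1]
loud_diff = [0.4, -0.4, 0.4, -0.4]
quiet_gain = auto_space_gain(quiet_diff, target_rms=0.2)
loud_gain = auto_space_gain(loud_diff, target_rms=0.2)
```

In this sketch the quiet block receives a gain near 2 and the
loud block a gain near 0.5, so the conditioning level comes
out the same for both, whereas space_gain would leave the loud
block four times stronger than the quiet one.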
When a stereo recording is reproduced
monophonically, it is said to be compatible if the overall
musical balance does not change. The dimensionality of the
stereo recording will disappear when reproduced monophonically
but the inner-instrumental balance should remain stable when
L+R (i.e., the left plus right sources have been combined to
monophonic sound, also called L=R).

The compatibility problem arises because monophonic
or the L+R signal broadcast in a conventional stereo broadcast
does not contain the total information present in the left and
right sources. When combined as such, it contains only the
information of similarity in vectorial proportion. The
differential information is lost. It is possible for the
differential signal to contain as much identity about the
musical content of a source as does the summation signal.

Since differential information will be lost in left
plus right combining, directional elements should comprise
most of the differential signal. Directional information will
be of little use in monophonic reproduction and its loss will
be of no consequence with respect to musical balance.
Therefore, additional dimensional or spatial producing
elements must be introduced in such a way that their removal
in L+R combining will not destroy the musical balance
established in the original stereophonic production.

Insertion of the conditioning signal at the
antiphasic image position produces enhancement to and
generation of increased spatial density in the stereo mode but
is completely lost in the mono mode where the directional
information will be unused. Information which can be lost in
the mono mode without upsetting the inner-instrument musical
balance includes clues relating to size, location, range and
ambience but not original source information.

To accomplish this, directional information is
obtained exclusively from the very source which is lost in the
monophonic mode, namely, left signal minus right signal.

WO 94/16538 PCT/US93/12688

Whether in the synthetic or organic model derivation
of a conditioning signal, subtracting the left signal from the
right signal and reinserting it at the antiphasic position
will not challenge mono/stereo compatibility, providing that
the level of conditioning signal does not cause the total RMS
difference energy to exceed the total RMS summation energy at
the output.
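
The compatibility condition stated above compares two RMS
levels measured at the output. A minimal sketch of such a
check follows; the function names are hypothetical, and
measuring over a whole buffer of samples rather than with a
running detector is an assumption made for brevity.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def is_mono_compatible(out_left, out_right) -> bool:
    """True when RMS of (L - R) does not exceed RMS of (L + R)."""
    difference = [l - r for l, r in zip(out_left, out_right)]
    summation = [l + r for l, r in zip(out_left, out_right)]
    return rms(difference) <= rms(summation)


if __name__ == "__main__":
    centered = [0.5, -0.25, 0.75]
    print(is_mono_compatible(centered, centered))                # in-phase
    print(is_mono_compatible(centered, [-x for x in centered]))  # reversed
```

An identical pair passes the check, since its difference is
zero, while a fully polarity-reversed pair fails it, since its
sum is zero.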

In order to aid in the understanding of this
invention, it can be stated in essentially summary form that
it is directed to a stereophonic image enhancement system and
apparatus wherein a conditioning signal is provided and
introduced into electronic signals which are to be reproduced
through two spaced loudspeakers so that the perceived sound
frame between the two loudspeakers is an open field which at
least extends toward the listener from the plane between the
loudspeakers and may include the perception of boundaries
which originate to the side of the listener. The conditioning
signal may be organic, if the original sound source is
appropriately miked, or it may be derived from the left and
right channel stereo signals.

In one aspect, the present invention provides an
automatic stereophonic image enhancement system and apparatus
wherein two channel stereophonic sound is reproduced with
signals therein which generate a third image point with which
boundary image planes can be perceived within the listening
experience resulting in an extended conceptual image space for
the listener.

In another aspect, the present invention provides
a stereophonic image enhancement system which includes
automatic apparatus for introducing the desired density of
conditioning signal regardless of program content into the
electronic signal which will be reproduced through the two
spaced speakers.

It is another objective to provide an automatic
stereophonic image enhancement system and apparatus wherein
the inner-instrumental musical balance remains stable when
heard in monophonic or stereophonic modes of reproduction.

It is another objective to provide a monophonically
compatible automatic stereophonic image enhancement system and
apparatus wherein the operator can be readily trained to
employ the system and apparatus to achieve desirable
recordings with enhanced conceptual image space.

The features of the present invention which are
believed to be novel are set forth with particularity in the
appended claims. The present invention, both as to its
organization and manner of operation, together with further
objects and advantages thereof, may be best understood by
reference to the following description, taken in conjunction
with the accompanying drawings.



BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a perspective view of a listener facing
two spaced loudspeakers, and showing the outline of an
enclosure.

Figure 2 is a schematic plan view of the perception
of a sound frame which includes a synthetic conditioning
signal which is included in the signals to the speakers.

Figure 3 is a schematic plan view of the perceived
open field sound frame where an organic conditioning signal
is introduced into the signal supplied to the speakers.

Figure 4 is a schematic plan view of the open field 
sound frame, as perceived from the listener's point of view, 
as affected by various changes within the conditioning signal. 

Figure 5 is a schematic plan view of a sound source
and microphone placements which will organically produce a
conditioning signal.

Figure 6 is a simplified schematic diagram of a
circuit which combines the organically derived conditioning
signal with the left and right channel signals.

Figures 7(a) and 7(b) form a schematic electrical
diagram of the automatic stereophonic image enhancement system
and apparatus in accordance with this invention.

Figure 8 is a schematic electrical diagram of an
alternate circuit therefor.

Figure 9 is a front view of the control panel for
the apparatus of Figure 8.

Figures 10(a) and 10(b) form a digital logic diagram
of a digital embodiment of the invention.

Figure 11 is a front view of a joystick box, a
control box, and an interconnecting data cable 420 which can
be used to house the embodiment of the invention described
with reference to Figures 12(a), 12(b), 13(a)-13(f), 14(a) and
14(b).

Figures 12(a) through 12(d) form a schematic diagram 
of an embodiment of the invention wherein joysticks may be 
used to move a sound around in a perceived sound field. 

Figures 13(a)-13(f) are graphical representations
of the control outputs which are generated by the joysticks
and associated circuitry and applied to voltage controlled
amplifiers of Figures 12(a)-12(d).

Figures 14(a) and 14(b) form a digital sound
processor logic diagram similar to that of Figures 10(a) and
10(b), but adapted for use as the digital sound processor 450
in Figures 12(a)-12(d).

Figure 15 is a schematic diagram of an embodiment
of the invention which is adapted for use in consumer quality
audio electronics apparatus, of the type which may be used in
the home, etc.

Figure 16 is a block diagram of an embodiment of the
invention adapted for use in consumer-quality audio electronic
apparatus, which embodiment includes an automatic control
circuit for controlling the amount of spatial enhancement
which the circuit generates.

Figures 17A and 17B may be joined to form a 
schematic diagram corresponding to the block diagram of Figure 
16. 

Figure 18 is a block diagram of another embodiment
of the invention, which block diagram includes an integrated
circuit implementing the circuitry of Figure 15 or of Figures
16, 17A and 17B.

Figure 19 is a block diagram of another embodiment,
similar to the embodiment of Figure 18, but providing for
multiple inputs.



BRIEF DESCRIPTION OF THE TABLES

Tables A through F set forth the data which is
graphically presented in Figures 13(a)-(f), respectively.
Tables G through X set forth additional data.



DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION

Figure 1 illustrates the usual physical arrangement
of loudspeakers for monitoring of sound. It should be
understood that in the recording industry sound is "monitored"
during all stages of production. It is "reproduced" when
production is completed and the product is in the market
place. At that point and on, what is being reproduced is the
production. Several embodiments of the invention are
disclosed. Some embodiments are intended for use during sound
production, while one embodiment is intended for use during
sound reproduction, in the home, for example. Embodiments of
the invention include the system and apparatus illustrated in
a first embodiment in Figures 5 and 6, a second embodiment 10
in Figure 7, a third embodiment 202 in Figure 8, a fourth
embodiment of Figures 10(a) and 10(b), and a fifth and presently
preferred embodiment (for professional studio use) in Figures
11, 12(a), 12(b), 13(a)-13(f), 14(a) and 14(b). These
embodiments may be employed in record, compact disc,
mini-disc, cassette, motion picture, video and broadcast
production, to enhance the perception of sound by human
subjects, i.e. listeners. Another and sixth embodiment, which
is disclosed with reference to Figure 15, may be used in a
consumer quality stereo sound apparatus found in a home
environment, for example.

During monitoring of sound for sound production, the
two loudspeakers 12 and 14 are of suitable quality with
enclosures to produce the desired fidelity. They are laterally
spaced, and the listener 16 faces them and is positioned
substantially upon a normal plane which bisects the line
between the speakers 12 and 14. Usually, the listener is
enclosed in a room, shown in phantom lines, with the
loudspeakers. During reproduction, the two loudspeakers may
be of any quality. The loudspeaker and listener location is
relatively unimportant. During monitoring, the effect is one
of many separate parts being blended together. Hence,
monitoring requires a standard listening position for
evaluating consistency, whereas during reproduction, the
effect has become one with the whole sound and can be
perceived from any general location.

Since several embodiments of the apparatus are
designed as a production tool, the loudspeakers 12 and 14
should be considered monitors being fed from an electronic
system which includes the sound production enhancement
apparatus of this invention. The electronic system may be a
professional recording console, multi-track or two-track
analogue, or digital recording device, with a stereophonic
two-channel output designated for recording or broadcasting.
The sound source material may be a live performance or it may
be recorded material or a combination of the foregoing.

Figure 2 illustrates the speakers 12 and 14 as being
enclosed in what is perceived as a closed field sound frame
24 (without the lower curved lines 17 and 26) which is
conventional for ordinary stereophonic production. By varying
the amplitude between the speakers 12 and 14, the apparent
source can be located anywhere within the sound frame 24, that
is, between the speakers. When a synthetic conditioning
signal is reinserted at the antiphasic image position 34,
amplitude and time ratios 17 are manifested between the three
points 12, 14 and 34. Because the antiphasic point 34 is the
interdependent product of the left point 12 and the right
point 14, the natural model is approached by a synthetic
construction, but never fully realized. The result is open
field sound frame 26. Listener 16 perceives the open field 26.

Figure 3 illustrates open field sound frame 28 which
is perceived by listener 16 when a conditioning signal
derived, as in Figure 2, is supplied and introduced as part
of the signal to speakers 12 and 14, but has as its source an
organic situation. The density of spatial information is
represented by the curved lines 17 in Figure 2 and is
represented by the curved lines 19 in Figure 3. It is apparent
that the density of spatial information is greater in Figure
3 because the three points which produced the original
conditioning signal are not electrically interdependent but
are acoustically interactive; information more closely
reflecting the natural model is supplied to the ear-brain
mechanism of listener 16.

Figure 4 illustrates the various factors which are
sensed by the listener 16 in accordance with the stereophonic
image enhancement systems of this invention. The two speakers
12 and 14 produce the closed field sound frame 24 when the
speakers are fed with homophasic signals. Homophasic image
position 30 is illustrated, and the position can be shifted
left and right in the frame 24 by control of the relative
amplitude of the speakers 12 and 14. The speakers 12 and 14
produce left and right real images, and a typical hard point
image 32 is located on the line between the speakers because
it is on a direct line between the real images produced by the
two real speakers. As described above, the hard point source
image can be shifted between the left and right speakers.

The antiphasic image position 34 is produced by
speakers 12 and 14 and may be perceived as a source location
behind the listener's head 16 at 34 under test or laboratory
demonstrations. Under normal apparatus operating conditions,
source 34 is not perceived separately but, through temporal
fusion, is the means by which an open field sound frame is
perceived. Position 34 is a perceived source, but is not a
real source. There is no need for a speaker at position 34.
Rather, by controlling the relationship between the antiphasic
image position and one or both of the real images all produced
by speakers 12 and 14, the image source can be located on a
line between one of the real images and the antiphasic image
position 34. Since the antiphasic image position 34 is a
perceived source, but is not a real source, the point between
it and speakers 12 and 14 is considered a soft point source
image. Such a soft point source image is shown at point 36.
Open field sound frame is thus produced and provides the
perception of virtual space boundaries 40, 42, 44 or 46 (not
on line), depending on the conditioning signal's phase
relationship to the original source. The perceived distance
for the virtual space boundaries 40, 42, 44 and 46 from the
closest hard point is from 2 to 30 feet (approximately 1-10
meters), depending on the dimension control setting of Figure
5 and the distance between speakers 12 and 14.

Figure 18 is a schematic diagram of an Eighth
embodiment of the invention, which embodiment modifies the
embodiment described in reference to Figure 15.

Figure 19 is a schematic diagram of a Ninth
embodiment of the invention, which is similar to the Eighth
embodiment, but which includes panning pots connected to the
inputs of the circuitry such as via a conventional recording
console.

First Embodiment 

Figure 5 is a schematic diagram of a sound source
which is to be recorded or amplified. Three microphones L, R
and C are shown located in front of the source. The L (left)
and R (right) microphones are approximately equally spaced
from the source on its left and right sides. The C
(conditioning) microphone is located further spaced from the
source and approximately equally spaced from the L and R
microphones.

The signal from the C microphone is adjusted in gain
and then is added to (at adder A, for example) and subtracted
from (at subtractor S, for example) the stereo signals L, R as
shown in Figure 6. The resulting processed output signals PL
and PR, when amplified and applied to speakers 12 and 14
(Figure 1), will produce an expanded sound image as described
with reference to Figures 3 and 4. By adjusting the gain of
the conditioning signal, C, the amount of expansion which occurs
can be controlled easily. In this embodiment, the conditioning
signal, C, is produced organically, that is, by a microphone
array pickup as shown in Figure 5 and connected as shown in
Figure 6. There exist many previous stereo recordings for
which there was no microphone at location C connected as shown
in Figure 6, and thus there would seem to be no simple way of
recreating the effect described above. However, as will be
seen, the conditioning signal can be created synthetically,
and introduced into the left and right channel signals, when
(1) the sound source is mixed down from a prerecorded tape in
a recording studio, for example, (2) the sound is broadcast,
or (3) prerecorded sound is received or reproduced in a
home environment. The conditioning signal is delayed time-wise
and filtered compared to the signals from microphones L and
R due to the placement of microphone C.
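
The mixing step of this embodiment can be illustrated with a minimal sketch, assuming the three microphone signals are available as equal-length lists of samples. The function name mix_conditioning and the sample values are hypothetical and are not taken from Figure 6; only the add-to-left, subtract-from-right arrangement comes from the description.

```python
def mix_conditioning(left, right, cond, gain):
    """Return processed outputs (PL, PR) for equal-length sample lists.

    PL = L + gain * C   (adder A)
    PR = R - gain * C   (subtractor S)
    """
    if not (len(left) == len(right) == len(cond)):
        raise ValueError("L, R and C must have the same number of samples")
    pl = [l + gain * c for l, c in zip(left, cond)]
    pr = [r - gain * c for r, c in zip(right, cond)]
    return pl, pr


# Hypothetical sample values, chosen only for illustration.
pl, pr = mix_conditioning([1.0, 0.5], [1.0, -0.5], [0.5, 0.25], gain=0.5)
print(pl, pr)
```

Setting gain to zero returns the unprocessed stereo pair, which corresponds to no expansion of the sound image.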

Second Embodiment

Now considering an embodiment of the apparatus 100
which produces the conditioning signal synthetically, the left
input lines 48 and 49 and right input lines 50 and 51 are
received from musical signal sources. The system and apparatus
10 is described in this embodiment as being a system which
introduces the conditioning signal before the two-channel
recording and, thus, is a professional audio laboratory
apparatus. Thus, the left and right inputs 48, 49, 50 and 51
may be the product of a live source or a mixdown from a
multiple channel tape produced by the live recording, or they
may be a computer generated source, or a mixture of same. The
inputs 48, 49, 50 and 51 of the apparatus address the output
of the recording console's "quad bus" or "4-track bus". Each
position on the recording console can supply each and every
bus of the quad bus with a variable or "panned" signal
representing a particular position. Two channels 49, 51 of the
quad bus are meant for use as stereo or the front half of
quadraphonic sound; the other two channels, 48, 50, are for
the rear half of quadraphonic sound. Normally, each position
or input of a modern recording console has a panning control
to place the sound of the input between left, right, front or
back via the quad bus. A recording console may have any number
of inputs or positions which are combined into the quad bus
as four separate outputs. The left front quad bus channel
addresses apparatus input 49; the right front quad bus channel
addresses apparatus input 51; the left rear quad bus channel
addresses apparatus input 48; and the right rear quad bus
channel addresses apparatus input 50. Alternate insertion of the
apparatus of Figure 7 is possible in the absence of a quad bus
by using the stereo bus plus two effect buses. Left front
input 49 (unprocessed) is connected to amplifier 52. Left rear
input 48 (to be processed) is connected to amplifier 54. Right
rear input 50 (to be processed) is connected to amplifier 56.
Right front input 51 (unprocessed) is connected to amplifier
58. The outputs of amplifiers 52 and 58 are respectively
connected to adders 60 and 62, so that
amplifiers 52 and 58 effectively bypass the enhancement system
100. The use of the quad bus allows the apparatus to address
its function to each input of a live session or each track of
recorded multi-track information, separately. This means that,
in production, the operator/engineer can determine the space
density of each track rather than settling for an overall
space density. This additional degree of creative latitude is
unique to this apparatus and sets it apart as a production
tool.

WO 94/16538 PCT/US93/12688

The amplified left and right signals in lines 68 and
70 are both connected to summing amplifier 72 and differencing
amplifier 74. The output in line 76 is, thus, L+R, but the
amplifier 72 also serves to invert the output so that it
appears as -(L+R). These sum and difference signals in lines
76 and 78 are added together in adder 60 and generate the left
program with a conditioning signal CL which adds additional
spatial effects to the left channel. The signal in line 78
also goes through invertor 80 to produce in line 82 the (L-R)
signal. Lines 76 and 82 are introduced into adder 62 to
generate in its output line 84 the right program with
conditioning signal CR. The output lines 79 and 84 from adders
60 and 62 go to the balanced-output amplifiers 86 and 88 for
the left output and 90 and 92 for the right output. The output
amplifiers are preferably differential amplifiers operating
as a left pair and a right pair, with one of each pair
operating in inverse polarity with the other half of each pair
for balanced line output.
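
The way the sum and difference signals recombine into left and right programs can be traced with a short sketch. It assumes unity gain in every stage, omits the conditioning path, and assumes that line 76 carries -(L+R), as an inverting summing amplifier would produce, and line 78 carries -(L-R). The function name and the factor of two in the result follow from those assumptions and are not stated in the description.

```python
def sum_difference_matrix(l, r):
    """Trace one sample pair through amplifiers 72/74, invertor 80 and adders 60/62."""
    line_76 = -(l + r)            # summing amplifier 72 (inverting)
    line_78 = -(l - r)            # differencing amplifier 74 (inverting)
    line_82 = -line_78            # invertor 80 restores (L - R)
    adder_60 = line_76 + line_78  # -(L+R) - (L-R) = -2L, the left program
    adder_62 = line_76 + line_82  # -(L+R) + (L-R) = -2R, the right program
    return adder_60, adder_62


print(sum_difference_matrix(0.25, -0.5))
```

Under these assumptions each adder output contains only its own channel, which is why a conditioning signal injected into the difference path appears with opposite polarity in the two outputs.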

The conditioning signals CL and CR are similar to
conditioning signal C of Figure 6, but are synthetically
produced. Also, they have somewhat different frequency
filtering which tends to broaden the rear sound images,
particularly the antiphasic position 34 (Figure 4).
Conditioning signals CL and CR are derived from the difference
signal -(L-R) in line 78 at the output of differencing
amplifier 74. The difference signal in line 78 passes through
high pass filter 94 which has a slope of about 18 decibels per
octave and a cutoff frequency of about 300 Hz to prevent comb
filtering effects at lower frequencies. The filtered signal
preferably, but not necessarily, passes through delay 96 with
an adjustable and selectable delay as manually input from
manual control 98, which is called "the Dimension Control".
The output of the delay 96 goes to voltage controlled
amplifier (VCA) 102 which provides level control. The DC
control voltage in line 104, which controls voltage controlled
amplifier 102, is supplied by potentiometer 106, in the Manual
Mode, or by the hereinafter described control circuit in the
Automatic Mode. Potentiometer 106 provides a DC voltage
divided down from a DC source 107. It functions as a "Space
Control" and it effectively controls the amount of expansion
of the sound perceived by a listener, i.e., it controls the
amount of the conditioning signal which is added to and
subtracted from the left and right channel signals.

The output from voltage controlled amplifier 102 in
line 108 is preferably connected via left equalizer 110 and
right equalizer 112 for proper equalization and phasing for
the individual left and right channels, which tends to broaden
the rear image. The illustrated equalizers 110 and 112 are of
the resonant type (although they could be any type) with a
mid-band boost of 2 dB at a left channel center frequency in
equalizer 110 of about 1.5 kilohertz and a right channel
center frequency in equalizer 112 of about 3 kilohertz. After passing
through the equalization circuits, the left conditioning
signal -CL occurs in line 114 and the right conditioning
signal -CR occurs in line 116. The left conditioning signal
-CL is added in adder 60. The right conditioning signal in
line 116 is connected to invertor 80 where the conditioning
signal -CR is added to the difference signal -(L-R) and the
sum is added to the sum signal to result in the right signal
minus right conditioning signal on line 84 and left signal
plus left conditioning signal on line 79.
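
The synthetic conditioning path described above (difference signal, high pass filter, delay, level control, then addition to the left and subtraction from the right) can be sketched in discrete time as follows. This is only an illustration under stated assumptions: the 18 dB per octave filter is approximated by three cascaded first-order sections, the delay is a whole number of samples, the equalizers 110 and 112 are omitted so that CL and CR are identical, and all function names and default values are hypothetical.

```python
import math


def one_pole_highpass(samples, cutoff_hz, fs):
    """First-order RC-style high-pass section (6 dB per octave)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs
    alpha = rc / (rc + dt)
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def conditioning_signal(left, right, fs, delay_samples, gain):
    """Derive C from (L - R): high-pass, delay, then level control."""
    diff = [l - r for l, r in zip(left, right)]
    # Three cascaded first-order sections approximate an 18 dB/octave slope.
    for _ in range(3):
        diff = one_pole_highpass(diff, 300.0, fs)
    # Whole-sample delay, keeping the original length.
    delayed = ([0.0] * delay_samples + diff)[:len(diff)]
    return [gain * d for d in delayed]


def enhance(left, right, fs=48000, delay_samples=0, gain=0.5):
    """Return (L + C, R - C); equalizers 110 and 112 are not modelled."""
    c = conditioning_signal(left, right, fs, delay_samples, gain)
    out_l = [l + ci for l, ci in zip(left, c)]
    out_r = [r - ci for r, ci in zip(right, c)]
    return out_l, out_r
```

Because the same signal is added to one channel and subtracted from the other, the sum of the two outputs equals the sum of the two inputs, and a monophonic input (L equal to R) passes through unchanged.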

The automatic control circuit generally indicated
at 118 monitors the output signals in lines 79 and 84 and
regulates the amount of conditioning signal to keep a
Lissajous figure generated on an X-Y oscilloscope, connected
to the outputs, relatively constant. The Lissajous figure is
a figure displayed on the CRT of an oscilloscope when the two
outputs are connected to the sweep and amplitude drives of the
oscilloscope. When the Lissajous figure is fairly round, the
energy in the sum and in the difference of the two outputs
is substantially equal (a desirable characteristic). Lines 84
and 79 are respectively connected to the inputs of
differencing amplifier 120 and adding amplifier 122. The
outputs are respectively rectified, and rectifiers 124 and 126
provide signals in lines 128 and 130. The signals in lines 128
and 130 are, thus, the full wave rectified difference and sum
signals of the apparatus output respectively out of subtractor
120 and adder 122.

Lines 128 and 130 are connected to filters 132 and
134 which have adjustable rise and fall ballistics. Selector
switch 136 selects between the manual and automatic control
of the control voltage in line 104 to voltage controlled
amplifier 102. The manual position of selector switch 136 is
shown in Figure 7(a), and the use of the space expansion
control potentiometer 106 has been previously described. There
are several individual switches controlled by selector switch
136, as indicated in Figure 7(a). When the space control
switch is switched to the other, automatic position, the
outputs of filters 132 and 134 in lines 138 and 140,
respectively, are processed and are employed to control
voltage controlled amplifier 102.
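
The rectification and smoothing with rise and fall ballistics can be illustrated by a simple envelope follower. This is a sketch under assumptions: the description gives no time constants, so the rise and fall values below are placeholders, and the exponential smoothing law and the function names are hypothetical.

```python
import math


def envelope(rectified, fs, rise_s, fall_s):
    """Smooth a rectified signal with separate rise and fall time constants."""
    rise = math.exp(-1.0 / (rise_s * fs))
    fall = math.exp(-1.0 / (fall_s * fs))
    env, out = 0.0, []
    for x in rectified:
        coeff = rise if x > env else fall   # fast when rising, slow when falling
        env = coeff * env + (1.0 - coeff) * x
        out.append(env)
    return out


def sum_diff_envelopes(out_l, out_r, fs, rise_s=0.01, fall_s=0.3):
    """Return smoothed (difference, sum) envelopes of the apparatus outputs."""
    diff = [abs(l - r) for l, r in zip(out_l, out_r)]  # full wave rectified difference
    summ = [abs(l + r) for l, r in zip(out_l, out_r)]  # full wave rectified sum
    return envelope(diff, fs, rise_s, fall_s), envelope(summ, fs, rise_s, fall_s)
```

The two envelopes returned here play the role of the filtered signals that the automatic mode compares in order to set the control voltage.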

When space control selector switch 136 is in the
automatic position, the output of error amplifier 142 is
connected through gate 144 to control the voltage in line 104.
The error amplifier 142 has inputs directly from line 138 and
from line 140 through switch segment 146 and back through line 148.
The filtered sum signal in line 140 is connected through the
space expansion potentiometer 106 so that it can be used to
reduce the apparent level of the output sum information to
error amplifier 142 to force the error amplifier 142 to reduce
the sum/difference ratio.

Comparator 150 is connected to receive the filtered
sum and difference information in lines 138 and 140.
Comparator 150 provides an output into gate line 152 when
space control selector switch 136 is in the automatic mode and
when a monophonic signal is present at inputs 48 and 50. This
occurs, for example, when an announcer speaks between music
material. When comparator 150 senses monophonic material, gate
line 152 turns off gate 144 to shut down voltage controlled
amplifier 102 to stop the conditioning signal. This is done
to avoid excessive increase in stereo noise, from random phase
and amplitude changes, while the input program material is
fully balanced. The automatic control circuit 118 cannot
distinguish between unwanted noise and desired program
material containing difference information. Therefore, a
threshold ratio is established between the sum and difference
information in lines 138 and 140 by control of the input
potentiometer into comparator 150. The comparator 150 and gate
144 thus avoid the addition of false space information in a
conditioning signal which, in reality, would be a response to
difference-noise in the two channels. The comparator 150 thus
requires a specific threshold ratio between the sum and
difference information, under which the gate 144 is turned off
and over which the gate 144 is turned on.
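
The gating decision can be sketched as a single comparison, assuming the threshold is expressed as the ratio of the difference envelope to the sum envelope. The function name and the threshold value of 0.1 are hypothetical; the description states only that a threshold ratio exists and is set by a potentiometer.

```python
def gate_open(diff_env, sum_env, threshold_ratio=0.1):
    """Return True when difference/sum exceeds the threshold (gate on).

    Written as a product so that silence (both envelopes zero) keeps the
    gate off without dividing by zero.
    """
    return diff_env > threshold_ratio * sum_env


# Monophonic speech: almost no difference information, so the gate stays off.
print(gate_open(0.01, 1.0))
# Stereo music: substantial difference information, so the gate turns on.
print(gate_open(0.5, 1.0))
```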

Clipping circuit 153, see the center left of Figure
7(a), is provided to present a signal when the system is
almost in a clipping situation and another signal when
clipping is present. "Clipping" is a rapid increase in
distortion caused by dynamic peaks in the program material
being limited by the static limit imposed by the power supply
voltage in the circuit. Lines 154 and 156, which are the inputs
of amplifiers 52 and 58, are connected, along with lines 68,
70, 79 and 84, each through its own diode to bus 158. Bus
158 is connected through a resistance to input 160 of
comparator 162. A negative constant voltage source is
connected through another resistor to the input 160, and the
comparator 162 is also connected to ground. By management of
the two resistors, the comparator 162 has an input when bus
158 reaches a particular level. When that level is reached,
output signal 164, such as a signal light, is actuated. Bus
158 is similarly connected through a resistor to the input 166
of comparator 168. The negative voltage source is connected
through another resistor to input 166, and the resistance
values are adjusted so that comparator 168 has an input when
clipping is taking place. Latching circuit 170 is actuated
when clipping has taken place to illuminate the two signal
lights 172 and 174. Those lights stay illuminated until reset
176 is actuated.
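
The two-level clipping indication with a latch can be sketched as a small state machine. The sketch assumes normalized signal levels, takes the absolute value of each monitored point (a real diode bus would respond to one polarity only), and uses hypothetical threshold values and names.

```python
class ClipMonitor:
    """Near-clip warning plus a clip indicator that latches until reset."""

    def __init__(self, clip_level=1.0, warn_level=0.9):
        self.clip_level = clip_level
        self.warn_level = warn_level
        self.latched = False

    def update(self, monitored_samples):
        """monitored_samples: instantaneous values at the monitored lines."""
        # Diode-OR onto the bus: the largest level dominates.
        bus = max(abs(s) for s in monitored_samples)
        near_clip = bus >= self.warn_level
        if bus >= self.clip_level:
            self.latched = True
        return near_clip, self.latched

    def reset(self):
        self.latched = False
```

The warning output follows the signal, while the latched output stays set after a single clipped peak, mirroring lights that remain illuminated until the reset is actuated.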

In the cutting of V-groove stereo records, a
difference signal results in vertical motion. Vertical motion
is the most difficult to track in playback. Therefore, large
signals which produce too much vertical motion when referenced
to lateral motion are usually avoided. It can be considered
saturation of the cutting function. Not exceeding the
saturation point is extremely important in proper disk
cutting. In FM broadcasting, similar restraints still apply,
since governmental regulatory bodies tend to require that the
difference signal be kept less than the L+R signal. Therefore,
a detection circuit 178 is shown in the lower right corner of
Figure 7(b). The rectified sum and difference signals in lines
130 and 128 are connected to peak followers 180 and 182. The
peaks generated by peak followers 180 and 182 are connected
to comparators 184 and 186. Comparator 184 gives an output
pulse whenever the difference peak envelope becomes greater
than the sum peak envelope, within plus or minus 3 dB. The
level controls at the outputs of the peak followers 180 and
182 allow an adjustment in the plus or minus 6 dB difference
for different applications. Comparator 186 has an output when
the sum/difference peak ratio approaches the trigger point of
comparator amplifier 184 within about 2 dB, and lights signal
light 188 on the front panel, illustrated in Figure 7(b), as
a visual warning of approaching L-R overload. This is
accomplished by reducing the apparent level of the sum
envelope by about 2 dB with the potentiometer connecting
comparator 186 to ground. The output of comparator amplifier
184 feeds a latching circuit 190 which activates light 195 and
which holds until reset by switch 192. When the latching
circuit is active, it activates driving circuit 194 which
lights panel lights 196 and 197 and, after a time delay, rings
audible alarm 198. At the same time, driving circuit 194
energizes line 199 which cuts off gate 144 to withhold the
signal to amplifier 102 which controls the conditioning
signal. Actuation of gate 144 removes the conditioning signal
from line 108, but permits the normal stereo signal to
continue through the circuit.
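
The overload and early-warning comparisons can be expressed in decibels as follows. This is a sketch only: the function names are hypothetical, the trigger point is taken as difference peak equal to sum peak, the adjustable level controls are not modelled, and the latch, alarm and gate cutoff are omitted.

```python
import math


def to_db(x):
    """Convert a linear peak level to decibels, guarding against log of zero."""
    return 20.0 * math.log10(max(x, 1e-12))


def overload_state(diff_peak, sum_peak, warn_margin_db=2.0):
    """Return (warning, overload) from the peak envelopes of L-R and L+R."""
    margin_db = to_db(sum_peak) - to_db(diff_peak)  # headroom before L-R exceeds L+R
    overload = margin_db < 0.0                      # role of comparator 184
    warning = margin_db < warn_margin_db            # role of comparator 186
    return warning, overload


print(overload_state(0.5, 1.0))   # about 6 dB of headroom
print(overload_state(0.9, 1.0))   # under 2 dB of headroom
print(overload_state(1.2, 1.0))   # difference exceeds sum
```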

Third Embodiment 

A third embodiment of the system and apparatus of
this invention is shown in Figure 8 and is generally indicated
at 200. For reasons already stated with respect to the system
and apparatus 100 of Figures 7(a) and (b), the left front quad
bus channel addresses unprocessed input 49 which is connected
to amplifier 204; the left rear quad bus channel addresses
processed input 48 which is connected to amplifier 206; the
right rear quad bus channel addresses processed input 50 which
is connected to amplifier 212; and the right front quad bus
channel addresses unprocessed input 51 which is connected to
amplifier 214. Amplifiers 204, 206, 212 and 214 are inverting
and provide signals in lines 208, 210, 216 and 218,
respectively. Both lines 208 and 210 are connected to summing
amplifier 220, while both lines 216 and 218 are connected to
summing amplifier 222. Lines 210 and 216 carry -L and -R
signals.

The conditioning signals CR and -CL are derived by
connecting differencing amplifier 224 to both lines 210 and
216. The resulting difference signal, -(R-L), is filtered in
high pass filter 226, similar to filter 94 in Figure 7(a), and
the result is subject to selected delay in delay circuit 228.
The delay time is controlled from the front panel, as will be
described with respect to Figure 9. The output from delay 228
goes through voltage controlled amplifier 230 which has an
output signal, -C, in line 232, which is supplied to both
non-inverting equalizer 234 and inverting equalizer 236. Those
equalizers respectively have conditioning signal outputs -CL
and +CR which are connected to the inverting summing
amplifiers 220 and 222. The left conditioning signal -CL is
added (and inverted) with the original left signal at
amplifier 220 to form L+CL, and the right conditioning signal
+CR is effectively subtracted from the original right signal
at invertor amplifier 222 to form R-CR. The outputs from
amplifiers 220 and 222, in lines 238 and 240, respectively,
are preferably and respectively connected to balanced left
amplifiers 242 and 244 and balanced right amplifiers 246 and
248, in the manner described with respect to amplifiers 86
through 92 of Figure 7(b). It may be useful to connect the
various points in the circuit of Figure 8 to the clipping and
L-R overload warning circuits 153 and 178 in the same manner
as previously described with reference to Figure 7(b).
Alternatively, VCA 230 may be manually controlled by a
potentiometer and DC supply combination, such as potentiometer
106 and supply 107. The difference between the two embodiments
of the system in Figures 7(a), 7(b) and 8 lies in the way the
original left and right signals are routed. In Figures 7(a)
and (b), the left and right signals are added and subtracted.
This sum and difference information is then re-added and
re-subtracted to reconstruct the original left and right
signals. In the circuit of Figure 8, the original left and
right signals are not mixed together. They remain independent
of each other from input to output.

The enhancement system may be automatic with
self-controlling features in the apparatus so that the
stereophonic image enhancement can be achieved without
continual adjustment of the system and apparatus.
Alternatively, manual control may be used, if desired.

The foregoing description of the invention, as it
has been described with reference to the detailed circuitry
shown in Figures 7(a), 7(b) and 8, has been basically in
analog terms with the various elements of the circuitry being
either analog devices or devices which could be either analog
or digital. For example, the delay line devices 96 (Figure
7(a)) and 228 (Figure 8) are more likely to be implemented
using digital components than by using analog components.
Thus, an analog to digital converter might be used immediately
prior to a linear digital delay line 96, 228 whose output can
then be converted to analog using a digital to analog
converter.

Alternatively, and preferably for the professional
equipment, predominantly digital implementations of the
invention are quite practicable, as will be seen in the
following embodiments.

Fourth Embodiment 

Turning now to Figures 10(a) and 10(b), they form
a digital logic diagram of a digital embodiment of the
invention which is conceptually somewhat similar to the
analog, or mostly analog, embodiment of Figures 7(a) and (b).
In Figures 10(a) and (b), data transmission lines are shown
in solid lines, while control lines are shown in dashed lines.

Left and right audio channel information is supplied
in multiplexed digital format at an input 302. Clock information
is also supplied at an input 304 to a formatter 306 which
separates the left channel information from the right channel
information. Preferably, formatter 306 de-multiplexes the
digital data which can be supplied in different multiplexed
synch schemes. For example, a first scheme might assume that
the data is being transmitted via a Crystal Semiconductor
model CS8402 chip for AES-EBU, S-PDIF inputs, or a second
scheme might assume that the digital data comes from an analog
to digital converter such as a Crystal Semiconductor model
CS5328 chip. The I/O mode input 305 preferably advises the
formatter 306 at the front end and the formatter 370 at the
rear end of the type of de-multiplexing and multiplexing
schemes required for the chips upstream and downstream of the
circuitry shown in Figures 10(a) and (b). Those skilled in the
art will appreciate that other multiplexing and
de-multiplexing schemes can be used or that the left and right
channel data could be transmitted in parallel, i.e.,
non-multiplexed data paths.

The left channel digital audio data appears on line
308 while the right channel digital audio data appears on line
309. These data are subtracted one from the other at a subtractor
324 to form R-L data. The R-L data is supplied to a switch 329
and may be filtered through a high-pass filter 326 and a low
pass filter 327 and is subjected to digital time delay at
device 328. The signal is filtered by filter 310 having a
narrow band pass preferably centered at 500 Hz with 6
dB/octave slopes on either side of its center frequency.
Those skilled in the art will appreciate that filters 326, 327
and 310 are represented as they might be, i.e., as separate
filters, in an analog embodiment. In a digital embodiment,
the functions of filters 326, 327 and 310 are preferably
implemented in a digital signal processor and they, along with
delay 328, may be performed in different sequences and the
functions can be combined as a matter of design choice. If
desired, filters 326, 327 may be eliminated.

Switch 329 is controlled by a C-mode control 303
which effectively controls the position of switch 329, which
is shown in Figures 10(a) and (b) in its C-mode position,
that is, where the filters 326, 327 and 310 and the time delay
328 are bypassed. The C-mode is preferably used when the
apparatus is used with live sources, such as might be
encountered during a concert or a theatrical performance, and
a C microphone input source (Figures 5 and 6) is available,
so that the C signal then need not be synthetically produced.
The R-L data is preferably subjected to the filtering and time
delay to generate the conditioning signal C when the invention
is used to mix down a recorded performance from a multi-track
tape deck, for example.

The output from switch 329 is supplied to a variable
gain digital circuit 330 which is functionally similar to the
voltage controlled amplifier 102 shown in Figure 7(b). A mute
control input can be used to reduce the gain at gain control
330 very quickly, if desired. The output of variable gain
digital circuit 330 is applied to an adder 320 and to a
subtractor 332 so that the control signal C is added to and
subtracted from the left audio data and right audio data on
lines 379 and 384, respectively. That data is then multiplexed
at formatter 370 and output in digital form at serial output
390.

The variable gain circuitry 330, which can be
implemented rather easily in the digital domain by shifting
bits, for example, is controlled either from a manual source
or an automatic source, much like the voltage controlled
amplifier 102 of Figure 7(b). In the manual position of switch
367 shown in Figure 10(b), the gain through circuitry 330 is
controlled by a "space control" input 362 which is
conceptually similar to the space control potentiometer 106
shown in Figure 7(a) and the potentiometer shown in Figure 6. In
the automatic position of switch 367, the gain in circuitry
330 is automatically controlled in a manner similar to that
of Figures 7(a) and (b). In Figures 10(a) and (b), the data
on lines 379 and 384 are summed at a summer 342 and, at the
same time, subtracted at subtractor 340. The outputs are
respectively applied to high-pass filters 346 and 344, whose
outputs are in turn applied to root mean square (RMS)
detectors 350 and 348, respectively. Detector 348 outputs a
log difference signal, while detector 350 outputs a log sum
signal. The value of the log difference signal from detector
348 can be controlled from the "Space In" input 362 at adder
352, in the automatic mode, so that the "Space In" value
offsets the output of the log difference detector, as follows:



(1) 00 for a difference level 12 dB below the sum level;

(2) 80 for a difference level equal to the sum level; and

(3) FF for a difference level 12 dB over the sum level.

The output of adder 352 and the log sum output from
detector 350 are applied to a comparator 354, which is
conceptually similar to the comparator 150 of Figure 7(a). The
output of comparator 354 is applied to a rate limiter 356
which preferably limits the rate at which the output from
comparator 354 can change the gain of circuit 330
to approximately 8 dB per second.
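
The "Space In" offset and the rate limiter can be sketched as two small functions. The three code points (00, 80 and FF) and the 8 dB per second figure come from the description; the linear interpolation between code points, the update interval and both function names are assumptions made for illustration.

```python
def space_in_offset_db(code):
    """Map an 8-bit "Space In" code to a dB offset.

    0x00 -> -12 dB, 0x80 -> 0 dB, 0xFF -> just under +12 dB
    (linear interpolation is assumed between the stated code points).
    """
    if not 0 <= code <= 0xFF:
        raise ValueError("code must be an 8-bit value")
    return (code - 0x80) * 12.0 / 0x80


def rate_limited_gain(current_db, target_db, dt_s, max_rate_db_per_s=8.0):
    """Move current_db toward target_db by at most max_rate_db_per_s * dt_s."""
    max_step = max_rate_db_per_s * dt_s
    step = max(-max_step, min(max_step, target_db - current_db))
    return current_db + step


print(space_in_offset_db(0x80))
print(rate_limited_gain(0.0, 10.0, dt_s=0.5))
```

With a half-second update interval, a requested 10 dB increase is limited to 4 dB in the first step, so abrupt changes in program material produce gradual changes in the amount of conditioning signal.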

Those skilled in the art will appreciate that the
circuitry shown in Figures 10(a) and (b), instead of
being implemented in discrete digital circuitry, preferably can
be implemented by programming a digital signal processor chip,
such as the model DSP 56001 chip manufactured by Motorola,
by known means.

The automatic control circuitry 378 is also shown
in Figures 10(a) and (b). When switch 367 is in its automatic
position, the automatic control circuitry 378 effectively
controls the amount of spatial effect added by the invention
depending upon the amount of spatial material initially in the
left and right audio. That is to say, if the left and right
audio data being input into the circuitry has high spatial
impressions already, the amount of spatial effect added by the
present invention is less than if the incoming material has
less spatial impression information in it originally. The
control circuitry 378 also helps to keep the envelope of the
L-R signal less than the envelope of the L+R signal. That can
be important for FM and television broadcasting where
governmental agencies, such as the FCC in the United States,
often prefer that the broadcast L-R signal be no greater than
the L+R signal. Thus, the embodiment of the invention
disclosed with respect to Figures 10(a) and (b) is
particularly useful in connection with the broadcast industry
where the spatial effects added by the circuitry can be
automatically controlled without the need for constant manual
input. It also should be emphasized that the present invention
is completely mono-compatible, that is to say, when the
present invention is used to enhance the spatial effects in
either a radio FM broadcast or a television sound FM
broadcast, those receivers which are not equipped with stereo
decoding circuitry do not produce any undesirable effects in
their reproduction of the L+R signal due to the spatial effects
which are added by the present invention to the L-R signal
being broadcast.

The R/L equalization on line 312 controls the amount
of boost provided by filter 310. That boost is currently set in
the range of 0 to +8 dB and more preferably at +4 dB. The
center frequency of filter 310 is preferably set
at 500 Hz, but it has been determined that filter 310 may have
center frequencies in the range of 300 Hz to 3 kHz.

The WARP In input to time delay 328 adjusts the time
delay. The time delay is preferably set at zero delay for
audio reproduction, 1.0 mSec for broadcasting applications,
4-6 mSec for mechanical record cutting, and up to 8 mSec for
cinematic production applications.
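
In a digital implementation these delay settings translate into a number of samples at the working sample rate. The following sketch shows the conversion; the table and function names are hypothetical, and where the description gives a range a single representative value is assumed, as noted in the comments.

```python
WARP_PRESETS_MS = {
    "audio reproduction": 0.0,
    "broadcasting": 1.0,
    "record cutting": 5.0,        # description gives 4-6 mSec; midpoint assumed
    "cinematic production": 8.0,  # description gives "up to 8 mSec"; maximum assumed
}


def delay_in_samples(delay_ms, sample_rate_hz):
    """Convert a delay in milliseconds to the nearest whole number of samples."""
    return round(delay_ms * sample_rate_hz / 1000.0)


for name, ms in WARP_PRESETS_MS.items():
    print(name, delay_in_samples(ms, 48000))
```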

Fifth Embodiment 

While the automatic mode version of the present
invention can be very useful in broadcasting, the manual mode
of operation of the present invention will be very important
for the recording industry and for the production of theater,
concerts and the like, that is, in those applications in which
large multichannel sound mixing panels are currently used.
Such audio equipment usually has a reasonable number of audio
inputs, or audio channels, each of which is essentially mono.
The sound recording engineer has control of not only the
levels of each one of the channels but also, in the prior art,
uses a pan control to control how much of the mono signal
coming into the sound board goes into the left channel and how
much goes into the right channel. Additionally, the engineer
can control the amount of the signal going to the rear left
and rear right channels on a quad bus audio board.

Turning again to Figure 4, the pan control of the
prior art permits a sound source point image 32 to be located
anywhere on the line between the left and right speakers 12
and 14 depending on the position of the pan control. For that
simple reason, stereo recording was a large improvement over
the mono recordings of forty years ago. Just imagine, however,
the even greater effect which can be imparted to a listener 16 if
the image point can be moved anywhere: not only between the
two speakers, but to the left of the left speaker or to the
right of the right speaker, to the foreground position (such
as point 36) shown in Figure 4, or even to a point behind the
listener such as the antiphasic image position 34 shown in
Figure 4. The present invention provides audio engineers with
such capabilities. Instead of using two pan controls such as
can be found on a quad deck, the audio engineer will be
provided with a joystick by which he or she will be able to
move the sound image both left and right and front and back
at the same time. The joystick can be kept in a given position
during the course of an audio recording session, a theatrical
or concert production, or alternatively, the position of the
joystick can be changed during such recording sessions or
performances. That is to say, the image position of the sound
can be moved with respect to a listener 16 to the left and
right and forward and back, as desired. If desired, the
effective position of the joystick can be controlled by a MIDI
interface.
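
One way to picture a single joystick replacing two pan controls is a bilinear split of a mono input across the four channels of a quad bus. The description does not state the law that relates joystick position to channel levels, so the mapping, the coordinate ranges and the function name below are purely illustrative assumptions.

```python
def quad_gains(x, y):
    """Bilinear gains for joystick x (left -1 .. right +1) and y (back -1 .. front +1)."""
    if not (-1.0 <= x <= 1.0 and -1.0 <= y <= 1.0):
        raise ValueError("joystick coordinates must lie in [-1, 1]")
    right, front = (x + 1.0) / 2.0, (y + 1.0) / 2.0
    left, back = 1.0 - right, 1.0 - front
    return {
        "left_front": left * front,
        "right_front": right * front,
        "left_rear": left * back,
        "right_rear": right * back,
    }


print(quad_gains(0.0, 0.0))    # centered joystick: equal share to all four channels
print(quad_gains(-1.0, 1.0))   # fully left and front: left front channel only
```

In this assumed law the four gains always sum to one, so moving the joystick redistributes the input among the channels without changing its total level.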

Initially, in connection with the audio recording
and mix down industries, the present invention will likely be
packaged as an add-on device which can be used with
conventional audio mixing boards. In the future, however, the
present invention will likely find its way into the audio
mixing board itself, the joystick controls (discussed above)
being substituted for the linear pan control of present
technology audio mixing boards.

Figure 11 shows the outward configuration of audio
components using the present invention which can be used with
conventional audio mixing boards known today. As shown in
Figure 11, the device has twenty-four mono inputs and
twenty-four joysticks, one for each input. Preferably, the
equipment comprises a control box 400 and a number of joystick
boxes 410 which are coupled to the control box 400 by a data
line 420. The joystick box 410 (shown in Figure 11) has eight
joysticks associated with it and is arranged so that it can
be daisy-chained with other joystick boxes 410, coupling with
the control box 400 by data cable 420 in a serial fashion.
Instead of having only eight joysticks in joystick box 410,
the joystick box 410 could have all twenty-four joysticks, one
for each channel, and, moreover, the number of joysticks and
channels can be varied as a matter of design choice. At
present it is preferred to package the invention as shown,
with eight joysticks in one joystick box 410. In due time,
however, it is believed that this invention will work its way
into the audio console itself, wherein the joysticks will
replace the panning controls presently found on audio
consoles.

This embodiment of the invention has enhanced,
processed left and right outputs 430 and 432 wherein all the
inputs have been processed left and right, front and back,
according to the position of the respective joysticks 415.
These outputs can be seen on control box 400. Unprocessed
outputs are also preferably provided in the form of a direct
left 434, a direct right 436, a direct front 438 and a direct
back 440 output, which are useful in some applications where
the mixing panel is used downstream of the control box, and
the audio engineer then has the ability to mix processed left
and/or right outputs with unprocessed outputs, when desired.

Figures 12(a)-12(d) form a schematic diagram of the
invention, particularly as it may be used with respect to the
joystick embodiment. Turning now to Figures 12(a)-(d),
twenty-four inputs are shown at numerals 405-1 through 405-24.
Each input 405 is coupled to its own input control circuit
404. Since, in this embodiment,
there are twenty-four inputs 405, there are twenty-four input
control circuits 404-1 through 404-24. However, only one of
them, namely 404-1, is shown in detail, it being understood
that the others, namely 404-2 through 404-24, are preferably
of the same design as 404-1. The input control circuitry 404
is responsive to the position of its associated joystick for
the purpose of distributing the incoming signal at input 405
onto bus 408. Each joystick provides conventional X and Y dc
voltage signals indicative of the position of the joystick,
which signals are converted to digital data, the data being
used to address six look-up tables, a look-up table being
associated with each of the voltage controlled amplifiers
(VCA's) 407 which comprise an input circuit 404. The value in
the table for particular X and Y coordinates of the joystick
indicates the gain of its associated VCA 407. The digital
output of the look-up table is converted to an analog signal
for its associated VCA 407. Each VCA 407 has a gain between
unity and zero, depending on the value of the analog control
voltage signal. Thus, from 0% to 100% of the signal being
input at input 405-1, for example, finds its way onto the
various lines forming bus 408 depending upon the position of
joystick 415-1. Similarly, input 405-2 has its input
distributed amongst the various lines making up bus 408,
depending upon the position of its joystick 415-2. The same
thing is true for the remaining inputs and remaining
joysticks. Also, as will be seen, the distribution of the
signals is controlled somewhat by the position of a switch
409, whose function will be discussed in due course.
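
The path from joystick position to VCA gain described above can
be sketched in software as follows. This is only an illustrative
sketch: the 0 to 5 volt joystick range, the table size of 32
steps per axis, and every gain law in the placeholder tables are
assumptions made for the example and are not the values of the
tables referred to in this description.

```python
STEPS = 32  # assumed table size per axis
VCA_NAMES = ("L", "F", "R", "BL", "M", "BR")


def quantize(voltage, v_min=0.0, v_max=5.0, steps=STEPS):
    """Convert a joystick dc voltage into a table index 0..steps-1."""
    fraction = (voltage - v_min) / (v_max - v_min)
    return max(0, min(steps - 1, int(fraction * steps)))


def placeholder_table(fn, steps=STEPS):
    """Build a steps-by-steps table from fn(x, y), with x and y in 0..1."""
    last = steps - 1
    return [[fn(xi / last, yi / last) for xi in range(steps)]
            for yi in range(steps)]


# Invented gain laws, NOT the values of the patent's tables.
# x runs from 0 (left) to 1 (right); y runs from 0 (front) to 1 (back).
TABLES = {
    "L": placeholder_table(lambda x, y: 1.0 - x),
    "F": placeholder_table(lambda x, y: 1.0 - y),
    "R": placeholder_table(lambda x, y: x),
    "BL": placeholder_table(lambda x, y: (1.0 - x) * y),
    "M": placeholder_table(lambda x, y: y),
    "BR": placeholder_table(lambda x, y: x * y),
}


def vca_gains(x_volts, y_volts, tables=TABLES):
    """Look up one gain (0.0 to 1.0) per VCA for a joystick position."""
    xi, yi = quantize(x_volts), quantize(y_volts)
    return {name: tables[name][yi][xi] for name in VCA_NAMES}
```

In the apparatus each looked-up value would be converted to an
analog control voltage for its VCA; the sketch stops at the
numeric gain.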

The currently preferred values in the look-up tables
are tabulated in Tables G-X. The data in Tables G-X
correspond to the action of VCA's 407L, 407F, 407R, 407BL,
407M, and 407BR, assuming that the position of the joystick
is resolved to 5 bits in its x-axis and to 5 bits in its
y-axis. As such, the position of the joystick can be resolved
to one of 32 positions along an x-axis and to one of 32
positions along a y-axis. Hence, each Table has 32 by 32
entries, corresponding to the possible positions of the
joystick. In practice, the x and y position information is
preferably resolved to greater precision (for example, to
320 x 320) and the data points are interpolated for those x and
y coordinate positions between the data points set forth in
the Tables. With this data, the joysticks can move the sound
source to the front, back, to the right or left by moving the
joystick to a corresponding position. The data in Tables A-F,
graphically depicted in Figures 13(a)-(f), are
conceptually similar but the full left and full right
positions are in the left and right quadrants of the joystick.
These Tables and Figures show the percentage of the signal
input at an input 405 which finds its way onto the various
lines comprising bus 408, where the various signals on each
line of the bus from different input circuits 404 are summed
together. The Tables G-X show the percentages for various
positions of a joystick 415 as it is moved left and right and
front and rear. Table G, which is associated with VCA 407L,
indicates that VCA 407L outputs 100% of the inputted signal
when its associated joystick is moved to the position maximum
left and maximum front. The outputted signal from VCA 407L
drops to under 20% of the inputted signal when the joystick
is moved to its maximum right, maximum back (or rear)
position. Other positions of the joystick cause VCA 407L to
output the indicated percentage of the inputted signal at 405.
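
The interpolation between table entries mentioned above might be
sketched as follows. The description does not say which
interpolation method is used; bilinear interpolation is assumed
here purely for illustration, and the function names and the
small demonstration table are hypothetical.

```python
def interpolate_gain(table, x, y):
    """Bilinearly interpolate a square table at x, y (each 0.0 to 1.0)."""
    n = len(table) - 1
    fx = min(max(x, 0.0), 1.0) * n
    fy = min(max(y, 0.0), 1.0) * n
    x0, y0 = int(fx), int(fy)
    x1, y1 = min(x0 + 1, n), min(y0 + 1, n)
    tx, ty = fx - x0, fy - y0
    top = table[y0][x0] * (1.0 - tx) + table[y0][x1] * tx
    bottom = table[y1][x0] * (1.0 - tx) + table[y1][x1] * tx
    return top * (1.0 - ty) + bottom * ty


def gain_at_fine_position(table, ix, iy, fine_steps=320):
    """Gain for a joystick position resolved to fine_steps per axis."""
    last = fine_steps - 1
    return interpolate_gain(table, ix / last, iy / last)
```

With a 32 by 32 table and positions resolved to 320 steps per
axis, roughly ten fine positions fall between each pair of
adjacent table entries, so the gain changes smoothly as the
joystick moves.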

VCA 407L receives a control voltage input VCX-L for
controlling the amount of the input signal at 405 which finds
its way onto bus 408L. Similarly, VCA 407R controls the amount
of input signal at 405 which finds its way onto line 408R. The
same thing is true for VCA's 407F, 407R, etc. The voltage
controlled amplifiers 407 in the remaining input circuits 404-2
through 404-24 are also coupled to bus 408 in a similar
fashion and, thus, the currents supplied by the voltage
controlled amplifiers 407 are summed onto that bus structure.
Thus, the
various input signals 405-1 through 405-24 are steered, or
mapped, onto the appropriate line of bus 408 depending upon
the position of the respective joysticks 415-1 through 415-24.
The signals on lines comprising bus 408 are then converted
back into voltages by summing amplifiers 409, each of which
is identified by a subscript letter or letters corresponding to
the line of the bus with which it is coupled. The outputs of
summing amplifiers 409L, 409R, and 409F are applied directly
to three of the four direct outputs, 434, 436 and 438,
respectively. The direct back output 440 is summed from the
outputs of the summing amplifiers 409CDL, 409CDR, 409EL and
409ER.
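
The summing of the scaled inputs onto the bus, and the derivation
of the four direct outputs, can be pictured numerically as below.
This is a sketch under stated assumptions: the bus line names
simply follow the subscripts of the summing amplifiers, each
input is represented by a single sample value, and the per-line
gains are supplied by the caller rather than taken from the
tables.

```python
BUS_LINES = ("L", "R", "F", "M", "CDL", "CDR", "EL", "ER")


def sum_onto_bus(samples, line_gains):
    """Sum each input sample, scaled by its per-line gains, onto the bus.

    samples    -- one sample value per input
    line_gains -- one dict per input mapping bus line name to gain (0..1)
    """
    bus = {line: 0.0 for line in BUS_LINES}
    for sample, gains in zip(samples, line_gains):
        for line, gain in gains.items():
            bus[line] += sample * gain
    return bus


def direct_outputs(bus):
    """Form the four unprocessed (direct) outputs from the bus lines."""
    return {
        "direct_left": bus["L"],
        "direct_right": bus["R"],
        "direct_front": bus["F"],
        # The direct back output is the sum of the four back lines.
        "direct_back": bus["CDL"] + bus["CDR"] + bus["EL"] + bus["ER"],
    }
```

For example, two inputs with samples 1.0 and 0.5, the first sent
fully left and half to line CDL, the second fully right and half
to line EL, give a direct back value of 0.75.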

Tables G-X are the preferred tables when the
invention is used for cinema production.

Before going deeper into the description, it might
be helpful to the reader to explain some of the terminology,
particularly the subscripts which are being used in this
description. The reader has probably noted that the letter "L"
is associated with the left channel, the letter "R" with the
right channel, the letter "F" with front and the letter "M"
with mono-compatibility. The letters "BL" mean back left and
the letters "BR" mean back right. The perceived sound
locations for L, R, F, BL and BR are shown in Figure 2, for
example. The letter "C" is associated with the C-mode of
operation, which was briefly discussed with reference to
Figures 10(a) and (b). There is also a D-mode of operation and
an E-mode of operation in the embodiment of the invention now
being described. The mode of each input 405 is controllable
from a controller 410. See, for example, Figure 11 where for
each joystick 415 there is a mode switch 411 which can be
repeatedly pushed to change from mode C, to mode D, to mode
EL, to mode ER, and then back to mode C. In modes C and D,
switches 409L and 409R are in the position shown in Figure
12(c). Switch 409L changes position when in mode EL, while
switch 409R changes position when in mode ER. Light emitting
diodes (LED's) 412L and 412R of Figure 11 report the mode
the controller is in for each channel. For example, LED's
412L and 412R may both be amber while in mode C, may both be
green while in mode D, while in mode EL the left LED (412L)
would preferably be red while the right LED (412R) would be
off, with the opposite convention when in mode ER.
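
The mode switch and LED convention just described amount to a
small four-state cycle, which can be sketched as follows. The
names used are hypothetical, and the LED colors are only the
example colors given above.

```python
MODE_ORDER = ("C", "D", "EL", "ER")

# (left LED 412L, right LED 412R) for each mode
LED_STATES = {
    "C": ("amber", "amber"),
    "D": ("green", "green"),
    "EL": ("red", "off"),
    "ER": ("off", "red"),
}


def next_mode(mode):
    """Return the mode selected by one more push of the mode switch."""
    index = MODE_ORDER.index(mode)
    return MODE_ORDER[(index + 1) % len(MODE_ORDER)]
```

Pushing the switch four times from mode C therefore steps through
D, EL and ER and returns to C.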

Mode C is preferably used for live microphone array
recording of instruments, groups, ensembles, choirs, sound
effects and natural sounds, where a microphone array can be
placed at the locations shown in Figure 5. Mode D is a
directional mode which places a mono-source at any desired
location within the listener's conceptual image space, shown
in Figure 4, for example. Applications of mode D include
multi-track mix-down, commercial program production, dialogue
and sound effects, and concert sound reinforcement.

Mode E expands a stereo source and, therefore, each
input is associated with either a left channel (mode EL), or
a right channel (mode ER) of the stereo source. This mode can
be used to simulate stereo from mono-sources and allows
placement within the listener's conceptual image space, as
previously discussed. Its applications are the same as for
mode D.

Returning to Figures 12(a) and (b), the outputs from
summing amplifiers 409CDL and 409CDR correspond to the back
left and back right signals for the C and D-modes. The signals
are applied to a stereo analog-to-digital converter 412CD
which multiplexes its output onto line 414CD. Similarly,
stereo analog-to-digital converter 412E takes the E-mode back
left and E-mode back right analog data and converts it to
multiplexed digital data on line 414E. The digital data on
lines 414CD and 414E are applied to digital sound processors
(DSP's) 450 which will be subsequently described with
reference to Figures 14(a) and (b). The audio processors may
be identical, and may receive external data for the purpose
of determining whether they operate in the C-mode, D-mode or
E-mode, as will be described. The programming of the digital
sound processor (DSP) 450 can be done at a mask level or it
can be programmed in a manner known in the art by a
microprocessor attached to a port on DSP 450, which
microprocessor then downloads data stored in EPROMs or ROM's
into the DSP 450 during initial power-up of the apparatus. The
current preference is to use model 56001 DSP's manufactured
by Motorola. In practicing the present invention, it is
preferred to download the programming into the DSP 450 chips
using a microprocessor, since that makes it easier to
implement design changes should that become necessary. In due
course, it will be preferred to use mask level programming
since that should make the device more economical to produce.
In any event, the programming emulates the digital logic shown
in Figures 14(a) and (b). The outputs from the DSP 450 chips
are again converted back to analog signals by stereo digital
to analog converters 418CD and 418E. The outputs of stereo
digital to analog converters 418CD and 418E are summed along
with outputs from the mono compatibility channel, the front
channel 409F, the right channel 409R and the left channel
409L, through summing resistors 419, before being applied to
summing amplifiers 425L and 425R and thence to processed
stereo outputs 430 and 432. The summing resistors 419 all
preferably have the same value. The mono compatibility signal
from summing amplifier 409M is applied to a low and high-pass
equalization circuit which preferably has a low Q, typically
on the order of 0.2 or 0.3, centered around 1,700 Hz.
Equalization circuit 422 typically has a 6 dB loss at 1,700
Hz.
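
To give a feel for an equalization curve with a low Q and a 6 dB
loss at 1,700 Hz, the following sketch computes a digital
stand-in. It is not the equalization circuit of the apparatus,
which is described only by its Q, center frequency and loss: a
standard second-order peaking (bell) filter is substituted, and
the 48 kHz sample rate and the Q of 0.25 are assumptions chosen
for the example.

```python
import cmath
import math


def peaking_eq_coefficients(f0, q, gain_db, fs):
    """Second-order peaking filter coefficients (b, a), not normalized."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin)
    a = (1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin)
    return b, a


def magnitude_db(b, a, f, fs):
    """Magnitude response in dB of the filter (b, a) at frequency f."""
    z = cmath.exp(-1j * 2.0 * math.pi * f / fs)
    numerator = b[0] + b[1] * z + b[2] * z * z
    denominator = a[0] + a[1] * z + a[2] * z * z
    return 20.0 * math.log10(abs(numerator / denominator))


# Assumed values: Q of 0.25, 6 dB cut at 1,700 Hz, 48 kHz sampling.
B, A = peaking_eq_coefficients(1700.0, 0.25, -6.0, 48000.0)
```

Evaluating magnitude_db(B, A, f, 48000.0) over a range of
frequencies shows a broad dip that reaches 6 dB of loss at
1,700 Hz and returns toward unity gain at very low frequencies.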

In the D-mode, processed directional enhancement
information, i.e., the conditioning signal C, is added to (and
subtracted from) the output channels. This information is band
pass filtered by filters 456 and 457, for example, so that it
peaks in the mid-range. If the enhanced left and right signals
are summed together to form an L+R mono signal, this can show
up as a notch in the spectrum in that mid-range area. To
counteract that effect, the mono compatibility signal is
preferably used, which has a notch which is the antithesis of
the mid-range notch and which, in effect, balances the output
spectrum of an L+R mono signal. When the joystick is in the
center, equal amounts of the conditioning signals go to the
left and right channels and when those channels are summed to
form the R+L signal, the conditioning signal is effectively
canceled out since it was originally added to one channel and
subtracted from the other channel. So, with a back-centered
joystick, some mono compatibility signal is needed, and can
be seen in Table E and in Figure 13(e), for example, where the
VCX-M input to VCA 407M goes to approximately -5 dB (60%) when
the joystick is centered between left and right, but pulled
all the way towards the back. It should be understood by the
reader that spatial enhancement and mono compatibility of the
perceived conceptual image space and collapsed sound field is
achieved within a surprisingly narrow difference range
of a few dB. This is the nature of human hearing with respect
to the psychoacoustic phenomena toward which this invention
is directed.

Turning now to Figures 14(a) and 14(b), these
figures form a sound processor logic diagram similar to
Figures 10(a) and (b), but with a number of changes, the most
important of which follow:

(1) There is no need for an automatic control
circuit 378, as shown in Figures 10(a) and (b), since, in this
embodiment, the amount of expansion (the amount of spatial
effects which are added) is controlled manually by the
position of the joystick 415. There is also no variable gain
circuit such as 330 (Figure 10(b)) since the amount of gain
is controlled by the position of the joystick 415 as it, in
turn, controls the gain of the various VCA's 407 (Figures 12(a)
and (c)).

(2) The embodiment of Figures 10(a) and (b) operated
in either a C-mode or a non-C expansion mode (which is
identified as mode E in Figures 12(a), 12(b), 14(a) and
14(b)). The embodiment of Figures 12(a), 12(b), 14(a) and
14(b) also includes another mode (mode D) which, as will be
seen, causes certain changes to be made to the audio processor
logic of Figures 14(a) and (b) compared to the audio processor
logic of Figures 10(a) and (b). Referring again to Figures
14(a) and (b), the incoming serial data which was multiplexed
onto line 414 is de-multiplexed by formatter 451. Preferably,
the stereo A to D converters 412 (see Figure 12(d)) are
Crystal Semiconductor model CS5328 chips, while the stereo D
to A converters 418 (see Figure 12(d)) are Crystal
Semiconductor model CS4328 chips and, therefore, formatters
451 and 470 would be set up to de-multiplex and multiplex the
left and right digital channel information in a manner
appropriate for those chips. The left and right digital data
is separated onto buses 452 and 453, and is communicated to,
for example, a subtractor 454, to produce an R-L signal. The
R-L signal passes through the low pass and high pass filters
456 and 457 and the time delay circuit 458 when the circuit
is connected in the E-mode, as depicted by switch 455 (which
is controlled by an E-mode control signal). When in the
D-mode, switches 455 take the other position shown in the
schematic diagram and, therefore, the left channel digital
data on line 452 is passed through the top set of high pass
and low pass filters 456 and 457 and the time delay circuit
458, while the right channel digital data on line 453 is
directed through the lower set of high pass and low pass
filters 456 and 457 and time delay circuit 458. There is no
need to control the amplitude of the signal from the time
delay circuits 458, as was done in the embodiment of Figures
10(a) and (b), because the amplitudes of the
signals are being controlled at the input control circuits 404
of Figures 12(a) and (c) and the amount of processing is being
controlled at the input by the position of joysticks 415 (see
Figure 11). The outputs of time delay circuits 458 are applied
to respective left and right channel equalization circuits
460. The output of the left equalization circuit 460L is
applied via a switch 462 to an input of formatter 470. The
output of the right equalization circuit 460R is applied via
a switch 462 and an invertor 465 to an input of formatter 470.
As previously indicated, formatter 470 multiplexes the signals
received at its inputs onto serial output line 416.
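
The routing performed inside the sound processor in the three
modes can be summarized by the sketch below. It is a
simplification under stated assumptions: the callable named chain
is a hypothetical stand-in for the filters, time delay and
equalizer taken together (real filters carry state from sample to
sample, which is ignored here), and it is assumed that in the
E-mode both processing chains receive the same R-L signal.

```python
def route(left, right, mode, chain=lambda sample: sample):
    """Return (left_out, right_out) conditioning samples for one mode.

    left, right -- one sample from each de-multiplexed channel
    mode        -- "C", "D" or "E"
    chain       -- stand-in for filters, time delay and equalizer
    """
    if mode == "C":
        # Filters, delay and equalizer are bypassed.
        cond_left, cond_right = left, right
    elif mode == "D":
        # Left and right are processed independently.
        cond_left, cond_right = chain(left), chain(right)
    elif mode == "E":
        # The R-L difference signal is processed.
        difference = right - left
        cond_left, cond_right = chain(difference), chain(difference)
    else:
        raise ValueError("unknown mode: %r" % (mode,))
    # The right path passes through the invertor before the formatter.
    return cond_left, -cond_right
```

Because the right path is inverted, the two outputs of the E-mode
are equal and opposite, which is what allows the conditioning
signal to cancel when the final left and right outputs are summed
to mono.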

Time delay circuits 458 preferably add a time delay
of 0.2 millisecond. It is to be appreciated that the DSP's
450 and their associated A to D and D to A converters 412E,
412CD, 418E and 418CD have inherent delays of about 0.8
millisecond. Thus, the total delay produced by the inherent
delay of the circuit devices and the added delay in time delay
circuits 458 totals about 1 millisecond compared to the left
and right analog signals from amplifiers 409L and 409R.
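
The delay budget above is simple arithmetic, and it can be
expressed in samples once a sample rate is chosen. The 48 kHz
rate used in the usage note is an assumption for illustration;
the description does not state the converter sample rate.

```python
ADDED_DELAY_MS = 0.2     # added by the time delay circuits
INHERENT_DELAY_MS = 0.8  # DSP and converter delay, approximate


def total_delay_ms(added_ms=ADDED_DELAY_MS, inherent_ms=INHERENT_DELAY_MS):
    """Total delay of the processed path relative to the analog path."""
    return added_ms + inherent_ms


def delay_in_samples(delay_ms, sample_rate_hz):
    """Nearest whole number of samples for a delay in milliseconds."""
    return round(delay_ms * sample_rate_hz / 1000.0)
```

At an assumed 48 kHz, the total of about 1 millisecond
corresponds to 48 samples, of which roughly 10 would be supplied
by the added delay.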

Switches 462 are shown in the C-mode position, which
has been previously described. When in the D-mode or the
E-mode, the switches 462 change position so as to communicate
the outputs of the equalizers 460 to the formatter and
invertor 465, as opposed to communicating the unfiltered
signals on lines 452 and 453, which is done when in the C-mode.
The inversion which occurs in the right channel information
by invertor 465 is effectively done by subtracter 332 in the
embodiment of Figures 10(a) and (b). It is to be recalled that
subtracter 332 subtracts the right channel conditioning
information CR (from equalizer 312) from the right channel
audio data. In the embodiment of Figures 14(a) and (b), the
right channel conditioning signal is inverted by invertor 465.
It is then communicated via formatter 470 and the stereo
digital to analog converter 418 (see Figure 12(d)) onto a
summing bus, where it is summed through resistors 419, along
with the right channel information from summing amp 409R, into
an input of summing amp 425R. The left channel conditioning
signal CL is communicated, without inversion, via formatter
470 and the stereo digital to analog converter 418 (see Figure
12(d)) onto a summing bus where it is summed through resistors
419, along with the left channel information from summing amp
409L, into an input of summing amp 425L.

The invention has been described with respect to
both analog and digital implementations, and with respect to
several modes of operation. The broadcast mode, mode B, uses
a feedback loop to control the amount of processing being
added to stereo signals. In the C, D and E-modes, the amount
of processing being added is controlled manually. In the final
embodiment disclosed, the amount of processing is input
controlled by joystick. In the C-mode, the conditioning signal
which is added to and subtracted from the left and right
channel data undergoes little or no processing. Indeed, no
processing
is required if the conditioning signal is organically produced
by the location of microphone "C" in Figure 5. In the mode C
operation described with reference to Figures 10(a), 10(b),
12(a), 12(b), 14(a) and 14(b), the conditioning signal
bypasses the high pass/low pass filters and the time delay
circuitry. On the other hand, in the D and E-modes, the
conditioning signals are synthesized by the high pass/low pass
filter and, preferably, the time delay. In the E-mode it is
an R-L signal which is subjected to filtering, whereas in the
D-mode, the left and right signals are independently subjected
to filtering, for the purpose of generating the conditioning
signal C.

As can be seen by reference to Figures 10(a), 10(b),
14(a) and 14(b), the amount of time delay is controllable.
Indeed, some practicing the instant invention may do away with
time delay altogether. However, time delay is preferably
inserted to de-correlate the signal exiting the filters from
the left and right channel information to help ensure
mono compatibility. Unfortunately, comb filtering effects can
be encountered, but these seem to be subjectively minimized
by filters 456, 457, and 460. In order to minimize such
effects in the B, D and E-modes of operation, it is preferred
to use a time delay circuit such as 328 (Figures 10(a) and
(b)) or 458 (Figures 14(a) and (b)). In the organic mode
(Figure 6), the time delay is organically present due to the
placement of microphone "C" further from the sound source than
microphones "L" and "R".
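
The comb filtering mentioned above follows from a general
property of adding a signal to a delayed copy of itself: for
copies of equal amplitude, complete cancellation occurs at odd
multiples of one over twice the delay. The sketch below lists
those frequencies. It illustrates the idealized case only; in the
apparatus the delayed signal is filtered and reduced in level, so
any notches would be partial, and where the copy is subtracted
rather than added the notch and peak positions are interchanged.

```python
def comb_notch_frequencies(delay_s, max_hz):
    """Notch frequencies (Hz) for a signal summed with a delayed copy.

    Assumes equal amplitudes; notches fall at (2k + 1) / (2 * delay).
    """
    notches = []
    k = 0
    while True:
        frequency = (2 * k + 1) / (2.0 * delay_s)
        if frequency > max_hz:
            break
        notches.append(frequency)
        k += 1
    return notches
```

For a delay of 1 millisecond, the first three notches of the
idealized comb fall at about 500 Hz, 1,500 Hz and 2,500 Hz.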

The present invention can be used to add spatial
effects to sound for the purposes of recording,
broadcasting, or public performance. If the spatial effects
of the invention are used, for example, in audio processing
at the time of mixing down a multi-track recording to stereo
for the purposes of release of tapes, records or digital
discs, then when the tape, record or digital disc is played
back on conventional stereo equipment, the enhanced spatial
effects
will be perceived by the listener. Thus, there is no need for
additional spatial treatment of the sound after it has been
broadcast or after it has been mixed down and recorded for
public distribution on tapes, records, digital discs, etc.
That is to say, there is no need for the addition of spatial
effects at the receiving apparatus or on a home stereo set.
The spatial effects will be perceived by the listener whether
they are broadcast or whether they are heard from a
prerecorded tape, record or digital disc, so long as the
present invention was used in the mixdown or in the broadcast
process.

The present invention is also mono compatible. That
is to say, if a person listens to an L+R signal formed, for
example, from the outputs at 430 and 432, no artifacts of the
process will be perceived by the listener. This is important
for
television, FM and AM stereo broadcast as the greater populace
will continue to listen to mono signals for some time to come.
The present invention, while adding spatial expansion to the
stereo signals, does not induce artifacts of the process in
the L+R signal.

Digital delay devices can delay any frequency for
any time length. Linear digital delays delay all frequencies
by the same duration. Group digital delays can delay different
groups of frequencies by different durations.

The present invention preferably uses linear digital
delay devices because the effect works using those devices and
because they are less expensive than are group devices.
However, group devices may be used, if desired.
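
A linear digital delay of the kind preferred above can be
sketched as a fixed-length queue of samples: every sample,
whatever its frequency content, emerges the same number of sample
periods after it entered. The class name and interface are
hypothetical and are given only for illustration.

```python
from collections import deque


class LinearDelay:
    """Delay every input sample by a fixed whole number of samples."""

    def __init__(self, delay_samples):
        if delay_samples < 0:
            raise ValueError("delay_samples must not be negative")
        # Pre-fill with silence so the first outputs are zero.
        self._buffer = deque([0.0] * delay_samples)

    def process(self, sample):
        """Accept one input sample and return the delayed sample."""
        if not self._buffer:
            return sample  # zero delay: pass straight through
        self._buffer.append(sample)
        return self._buffer.popleft()
```

Feeding the samples 1, 2, 3, 4 through a two-sample delay yields
0, 0, 1, 2. A group delay device would instead need a filter
whose delay varies with frequency, which is not sketched here.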

The previously described embodiments, and
particularly the embodiments of Figures 7 through 14(a) and
(b), will be quite useful in the professional audio industry
in the various applications previously mentioned. However,
those embodiments tend to be too complex for convenient use
in consumer electronics equipment such as might be used in the
home. Thus, there is a need for an embodiment which may be
conveniently used in consumer quality electronics equipment,
and which preferably can be embodied in an easily manufactured
chip. Such an embodiment is disclosed with reference to Figure
15.



Sixth Embodiment 

Figure 15 is a schematic diagram of a sixth
embodiment of the invention, which embodiment can be
relatively easily implemented using a single semiconductor
chip and which may be used in consumer quality electronics
equipment, including stereo reproduction devices, television
receivers, stereo radios and personal computers, for example,
to enhance stereophonically recorded or broadcast music and
sounds.

In Figure 15, the circuit 500 has two inputs, 501
and 502, for the left and right audio channels found within a
typical consumer quality electronic apparatus. The signals at
input 501 are communicated to two operational amplifiers,
namely amplifiers 504 and 505. The signals at input 502 are
communicated to two operational amplifiers, namely amplifiers
504 and
506. The left and right channels are subtracted from each
other at amplifier 504 which produces an output L-R. That
output is communicated to a potentiometer 503 which
communicates a portion (depending upon the position of
potentiometer 503) of the L-R signal back through a band pass
filter 507 formed by a conventional capacitor and resistor
network.

Filter 507, in addition to band-passing the output
from amplifier 504, also adds some frequency dependent
time-delay (phase delay) to that signal, which is subsequently
applied to an input of amplifier 508. The output of amplifier
508 is the conditioning signal, C, which appears on line 509.
The conditioning signal, C, is added to the left channel
information at amplifier 505 and is subtracted from the right
channel information at amplifier 506 and thence output as
spatially enhanced left and right audio channels 511 and 512,
respectively.

Filter 507 preferably has a center frequency of 500
Hz with 6 dB/octave slopes. As previously mentioned, it has
been determined that the center frequency can fall within the
range of about 300 Hz to 3,000 Hz.
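
The signal flow of circuit 500 can be imitated digitally as a
rough sketch. Several assumptions are made that the description
does not supply: the band pass filter is modelled as a
first-order high pass followed by a first-order low pass, both
with their corner at the 500 Hz center frequency; amplifier 508
is taken to have unity gain; the potentiometer is a number
between 0 and 1; and a 48 kHz sample rate is used.

```python
import math


def band_pass(signal, fs, center_hz=500.0):
    """First-order high pass then first-order low pass (6 dB/octave)."""
    rc = 1.0 / (2.0 * math.pi * center_hz)
    dt = 1.0 / fs
    a_hp = rc / (rc + dt)
    a_lp = dt / (rc + dt)
    out = []
    hp_prev = x_prev = lp_prev = 0.0
    for x in signal:
        hp = a_hp * (hp_prev + x - x_prev)
        lp = lp_prev + a_lp * (hp - lp_prev)
        out.append(lp)
        hp_prev, x_prev, lp_prev = hp, x, lp
    return out


def enhance(left, right, amount, fs=48000.0):
    """Return (left_out, right_out) with conditioning signal C applied.

    amount -- potentiometer setting, 0.0 (no enhancement) to 1.0
    """
    difference = [amount * (l - r) for l, r in zip(left, right)]
    conditioning = band_pass(difference, fs)
    left_out = [l + c for l, c in zip(left, conditioning)]
    right_out = [r - c for r, c in zip(right, conditioning)]
    return left_out, right_out
```

Two properties of the arrangement are visible in the sketch. With
identical left and right inputs the difference is zero, so the
outputs equal the inputs. And because C is added to one channel
and subtracted from the other, the sum of the two outputs equals
the sum of the two inputs for any setting of the potentiometer.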

Outputs 511 and 512 may then be conveyed to the
inputs of the power amplifier of the consumer quality audio
apparatus and thence to loudspeakers, in the usual fashion.
The listener controls the amount of enhancement added by
adjusting potentiometer 503. If the wiper of potentiometer 503
is put to the ground side, then the stereo audio program will
be heard with its usual un-enhanced sound. However, as the
wiper of potentiometer 503 is adjusted to communicate more and
more of the L-R signal to the band pass filter 507, more and
more additional spatially processed stereo is perceived by the
listener. For example, if the listener happens to be watching
a sporting contest on television which is broadcast with
stereo sound, by adjusting potentiometer 503, the listener
will start to perceive that he or she is actually sitting in
the stadium where the sporting contest is occurring due to
the additional spatial effects which are perceived and
interpreted by the listener.

If the signal at the inputs is mono (i.e., R=L), then
no artifacts of the present process will be perceived by the
listener. If enhancement of a monaural signal (i.e., L=R) is
desired, then that may be obtained by the eighth embodiment,
which will be subsequently described.

The circuitry of Figure 15 is shown with essentially
discrete components with the exception of the amplifiers,
which are preferably National Semiconductor model LM837
devices. However, those skilled in the art will appreciate
that all (or most) of the circuit 500 can be reduced to a
single silicon chip, if desired. Those skilled in the art will
also appreciate that capacitors C1 and C2 in band pass filter
507 will tend to be rather large if implemented on the chip
and, therefore, it may be desirable to provide appropriate
pin-outs from the chip for those devices and to use discrete
devices for capacitors C1 and C2. That is basically a matter
of design choice.



Seventh Embodiment 

The sixth embodiment of the invention, which was
described with reference to Figure 15, may be used in consumer
quality electronics equipment. However, as has been made
clear by discussion relative to the earlier described
embodiments, the present invention can also be used
professionally when recording music (and other audio material)
and/or when broadcasting music (and other audio material).
Thus, the present invention can be used to increase the
spatial image of music (or other recorded material) before or
after being recorded on disc, or before or after being
transmitted by a broadcaster or just before being heard by a
listener. It is preferable, however, when music or other
sounds are spatially enhanced in accordance with the present
invention, that the material not be overly enhanced.
Previously described embodiments of the present invention
include an automatic control circuit 118 which regulates the
amount of the conditioning signal generated in order to keep
the energy ratio between the sum and difference of the two
spatially enhanced outputs substantially equal. In this
regard, the second and third embodiments of the invention
include control circuit 118 which effectively controls the
amount of expansion which occurs.

The seventh embodiment of the invention is a
modified version of the sixth embodiment and includes an
automatic control circuit 518 for automatically controlling
the amount of spatial expansion which occurs. The seventh
embodiment is described with reference to Figures 16, 17A and
17B and is also intended to be used in consumer quality
electronics equipment as is the case with the sixth
embodiment. However, since the stereo signals being supplied
to the circuit may already be spatially enhanced (the signals
can be spatially enhanced during production or during
broadcasting, for example), the seventh embodiment includes
the automatic control circuit 518 to limit the amount of
spatial energy added by the circuit of this seventh
embodiment.

Figure 16 is a block diagram and Figures 17A and
17B may be joined together to form a schematic diagram. The
seventh embodiment is quite similar to the sixth embodiment,
and, therefore, common reference numerals are used for common
elements. Indeed, the biggest change is the addition of the
aforementioned automatic control circuit 518 which controls
the amount of spatial enhancement which the circuit generates.
Another change is the provision of a stereo synthesis mode of
operation. If the music or other audio material occurring at
the inputs 501 and 502 already has a high degree of spatial
energy because the music or other audio material has
previously been processed in accordance with the present
invention before it was received by the circuit of Figures 16,
17A and 17B, then it is not desirable to add further spatial
enhancement in this circuit. However, if the inputs 501 and
502 receive stereo music or other audio stereo material which
has not been previously spatially enhanced, then the circuitry
of Figures 17A and 17B should add the desired spatial
enhancement. Thus, the control system 518 of the present
invention acts to control the amount of the spatial
enhancement added. If the incoming stereo music or sound is
already spatially enhanced, little or no additional spatial
enhancement is provided. If the incoming music or sounds have
not been previously spatially enhanced, then the control
system permits the spatial enhancement to occur. If the
incoming material is monaural, the stereo sounds may be
synthesized.

The control system 518 of the present embodiment is
conceptually similar to the automatic control circuitry 118
previously described with reference to Figures 7A, 7B and 8,
but it is nevertheless described here with respect to this
presently preferred consumer electronics embodiment of the
invention.

In the sixth embodiment, the amount of spatial 
enhancement which occurs is controlled by using a 
potentiometer 503, which controls the amplitude of the 
conditioning signal, C, which appears on line 509. In this
seventh embodiment, instead of using a manual potentiometer
503, the magnitude of the conditioning signal C is controlled 
by a voltage controlled amplifier 503' which is responsive to 
a control input on line 510. The voltage controlled amplifier 
503' is preferably a model 2151 device sold by That
Corporation or its equivalent. The conditioning signal is
output on the line 560, which output can be utilized to drive 
an ambience or surround speaker, often located to the rear of 
the listener. 

The outputs on lines 511 and 512 are sampled and
added in circuitry 522 and subtracted in circuitry 520 to form
sum and difference signals on lines 530 and 528, respectively.
These sum and difference signals are applied to inputs of RMS
detectors 524 and 526. The RMS detectors are preferably model
2252 devices currently manufactured by That Corporation. The
outputs of the RMS detectors are applied to a comparator 550
whose output is coupled to a current source 551 (Figure 16).
The output of the current source is applied via a diode 552
to an amplifier 554. In Figures 17A and 17B, the current
source 551 and comparator 550 are provided by a single device,
an operational transconductance amplifier 550, 551, which
serves as a current source. When connected as
shown in Figures 17A and 17B, it can source up to 10 microamps
of current.

Potentiometer 534, whose wiper is connected to one
side of a resistor-capacitor network 553, allows the user to
control the amount of spatial enhancement which they desire
the circuit to produce. Resistor-capacitor network 553
controls the rate at which the spatial enhancement can be
changed by the output of current source 551. The current
flowing through diode 552 alters the voltage across
resistor-capacitor network 553 which is fed to amplifier 554.
The automatic control circuit 518 functions to limit the
amount of spatial energy which the circuit can add to the
signals on the left and right channels, that is to say, it
does not allow the user to overly spatially enhance the music
material. The user can, however, use less spatial
enhancement, if they so desire, by adjusting potentiometer
534. The output from the resistor-capacitor network 553 is
applied via a switch 581 to a high impedance buffer amplifier
554 whose output goes to the control input of the voltage
controlled amplifier 503'. Resistor-capacitor network 553, in
combination with the current source 551, controls the
ballistics of the circuitry, that is, the number of decibels
per second change invoked through voltage controlled amplifier
503'.
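
The behaviour of the automatic control circuit 518 can be
illustrated with a minimal numerical sketch. It is an
illustration under stated assumptions, not the circuit of
Figures 17A and 17B: the function names, the target ratio, the
slew step and the direction of the comparison are hypothetical,
and the RMS detectors 524 and 526, comparator 550 and network
553 are reduced to simple arithmetic on blocks of samples.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def update_gain_db(left_out, right_out, gain_db, user_max_db,
                   target_ratio=1.0, slew_db_per_block=0.5):
    """One step of a hypothetical automatic control loop.

    left_out, right_out: blocks sampled at the outputs (lines 511, 512).
    gain_db: current gain applied to the conditioning signal C.
    user_max_db: ceiling chosen by the user (the role of potentiometer 534).
    target_ratio, slew_db_per_block: assumed values, not from the source.
    """
    total = [l + r for l, r in zip(left_out, right_out)]  # sum signal
    diff = [l - r for l, r in zip(left_out, right_out)]   # difference signal
    # Assumed comparator rule: if the difference (spatial) level exceeds
    # the scaled sum level, back the gain off; otherwise let it recover.
    if rms(diff) > target_ratio * rms(total):
        gain_db -= slew_db_per_block
    else:
        gain_db += slew_db_per_block
    # The loop may only limit enhancement; it never exceeds the user ceiling.
    return min(gain_db, user_max_db)
```

The fixed step per block plays the part of the "ballistics"
described above: it bounds how many decibels the gain can move
per unit of time.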

In addition to adding spatial enhancement to stereo 
material, the present invention can also be used to synthesize 
stereo when the music or other sounds inputted at inputs 501
and 502 are monaural material. In order to synthesize stereo
from monaural information, a signal on line 580 causes 
switches 581, 582 and 583 to change position from that shown 
in the drawing. The input to buffer amplifier 554 is then a 
bias voltage which is preferably provided by a voltage divider
network 584. In this stereo synthesis mode, differential
amplifier 504 has one of its inputs grounded via switch 582 
and the other input continues to receive monaural information, 
which is assumed to be applied to both inputs 501 and 502. 
In the figures, the left input to differential amplifier 504
is shown as being grounded via switch 582, but it could be the
other input, if desired. Additionally, switch 583 adjusts the 
gain of amplifier 502 in order to keep the output channels in 
subjective balance in the stereo synthesis mode. 

The various semiconductor devices shown in the
schematic diagram of Figures 17A and 17B have previously been
implemented as bipolar semiconductor devices. It is believed
that all of those devices, plus most of the resistors and
capacitors, can be implemented together on a large single
bipolar semiconductor chip. Of course, some of the components,
including potentiometers and relatively large capacitors, are
best implemented off the chip and those elements are therefore
shown within dotted lines on Figure 16. Additionally,
assuming that those elements not enclosed within dotted lines
on Figure 16 are implemented on a single chip, then the single
chip would have pin-outs as indicated by the numerals in small
boxes, which numerals run between 1 and 18. Those skilled in
the art will appreciate, of course, that such a single chip
IC can be very conveniently packaged.

With respect to the parts which would not be
implemented on a single chip, switch 530 mutes or turns off
the conditioning signal, so that no spatial enhancement is
added by circuitry 518 when switch 530 is opened. Switch 530
can be a manual switch as shown, or it can be an electrically
operated switch, such as a transistor. The signal on line 580
might be controlled by the stereo detection circuitry of a
conventional radio, for example, to change the positions of
switches 581, 582 and 583, when no stereo signal is present,
to cause stereo to be synthesized. When only monaural
information is available at inputs 501 and 502, this circuit
should desirably not try to spatially enhance the signal in the
manner as done with a stereo signal, i.e., without changing
the signal on line 580, since the circuitry enhances the
difference information which, in terms of monaural
information, is noise.

Capacitors 516 and 517 form high pass filters that
limit the action of the automatic control circuit 518 to those
frequencies above the extremely low bass, i.e., above 100 Hz.
This is desirable because no significant spatial information
exists below 100 Hz.
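
As a rough guide to the size of such a coupling capacitor, the
sketch below assumes a simple first-order RC high pass, which
the source does not state. The 10 kilohm load resistance is
hypothetical; only the 100 Hz corner comes from the text.

```python
import math


def coupling_capacitance(cutoff_hz, load_ohms):
    """Capacitance in farads for a first-order RC high-pass corner."""
    return 1.0 / (2.0 * math.pi * cutoff_hz * load_ohms)


def highpass_gain(freq_hz, cutoff_hz):
    """Magnitude response of a first-order high-pass filter."""
    ratio = freq_hz / cutoff_hz
    return ratio / math.sqrt(1.0 + ratio * ratio)


# 100 Hz corner from the text; the 10 kilohm load is an assumption.
c_farads = coupling_capacitance(100.0, 10_000.0)
```

Under these assumptions the capacitor is on the order of a
tenth of a microfarad, which is consistent with the preference
for placing it off-chip.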

As in the case of the embodiment of Figure 15, the
band pass filter 507 comprises relatively large capacitors
which are preferably implemented off the chip, given their
sizes. Band pass filter 507 is preferably centered at 500 Hz.
Capacitors 516 and 517, which couple outputs 511 and 512 to
the differencing 520 and adding 522 circuit arrangements, are
also preferably implemented off-chip given their size. Also,
RMS detectors 524 and 526 have some relatively large
capacitors associated with them which are similarly preferably
implemented off-chip. The distortion trim pot 585 for VCA
503' is also implemented off-chip. However, a large number
of the components can be implemented on a single chip, and
therefore, this embodiment of the present invention can be
implemented very conveniently and inexpensively for consumer
grade stereo audio equipment.



Eighth Embodiment 

The sixth embodiment of the invention, which was
described with reference to Figure 15, is monaural compatible
in that if a monaural signal (i.e., R = L) is applied at its
inputs 501 and 502, no artifacts of the present sound imaging
enhancement method will be perceived by the listener. That
is to say, when a stereo signal input is supplied at the
inputs 501, 502, image enhancement occurs, but if a monaural
signal is applied at those inputs, no image enhancement
occurs. However, the embodiment of Figure 15 (or of Figures
16, 17A and 17B) can be used to enhance monaural information
if it is (or they are) modified as shown in Figure 18.

In Figure 18, it is assumed that the circuitry of
Figure 15 (or of Figures 16, 17A and 17B) has been implemented
as a chip 802, with the exception of the potentiometer 503
shown in Figure 15 (potentiometer 534 of Figures 16 or 17A).
Thus, inputs 822 and 824 correspond to the inputs +L and +R
which appear at inputs 501 and 502. Two potentiometers 804
and 806 are coupled between inputs 814, 816 and inputs 822,
824. These potentiometers are preferably wired and ganged for
counter operation, i.e., the resistance of one increases as
the other decreases with movement of the ganged wipers.
Potentiometer 503 appears as potentiometer 812 in Figure 18.
Additional potentiometers 808 and 810 have been connected at
the outputs 826 and 828 (which correspond to outputs 511 and
512 in the environment of Figure 15). These potentiometers
are also ganged for counter operation as explained above with
reference to potentiometers 804 and 806.

The action of the manipulation apparatus and method
depends upon the existence of a difference between the input
signals applied at inputs 822 and 824 (which correspond to +L
and +R on Figure 15). The difference may be spectral and/or
temporal. By spectral differences, it is meant that the
distribution of energy over the sound spectrum differs between
the left and right signals. By temporal differences, it is
meant that the synchronization of the two input signals may
be offset in time with respect to the period of each signal.

It is this combination of spectral and temporal 
differences, resulting from the nature of live stereophonic 
recording and multi-track stereophonic recording, that causes
the generation of the conditioning signal, C, which has been
previously described. The conditioning signal appears, for
example, on line 509 of the embodiment of Figure 15. The 
conditioning signal can cause the sound emanating from two 
loudspeakers, located to the left and right front of the 
listener and coupled via amplifiers to outputs 818 and 820,
to actually appear, to the listener, to originate from behind
the listener, or at other locations in the listener's audio 
space, depending upon the content of the conditioning signal, 
C. A monophonic signal, when applied to the left and right 
channels, has equal spectral and temporal quality, even under
broad-band conditions. Thus, in that event, both inputs 822
and 824 receive the same signal. In that circumstance, and 
independently of the position of potentiometer 812, the 
enhancement effect of the described manipulation system is at 
its minimum and approaches zero, provided that the resistance
added by potentiometers 804 and 806 is the same in both the
left and right inputs. 
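
The monaural behaviour described above can be checked with a
simplified model of the signal flow. This is a sketch under
stated assumptions, not the circuit itself: the band pass
filtering and delay of the side chain are omitted, the
conditioning signal is taken as a scaled left-minus-right
difference, and the names `enhance` and `c_gain` are
hypothetical.

```python
def enhance(left, right, c_gain):
    """Add the conditioning signal C to the left and subtract it from the right.

    C is modelled as c_gain * (left - right); c_gain stands in for the
    level control (potentiometer 812). Filtering and delay are omitted.
    """
    c = [c_gain * (l - r) for l, r in zip(left, right)]
    out_left = [l + ci for l, ci in zip(left, c)]
    out_right = [r - ci for r, ci in zip(right, c)]
    return out_left, out_right
```

With identical inputs the difference is zero, so the outputs
equal the inputs whatever the setting of `c_gain`. With
differing inputs, C cancels when the two outputs are summed,
which is the sense in which the arrangement is monaural
compatible.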

Differences Due to Broad Spectral Imbalance 

However, by changing the resistance values of
potentiometers 804 and 806, so that the resistances are no
longer equal, the signals appearing at inputs 822 and 824
are no longer the same signal. In effect, a broad-band
spectral difference is realized at those two inputs. This
difference will generate the conditioning signal, C, at line
509 inside device 802. It should be noted, however, that the
conditioning signal is presented at outputs 826 and 828 at the
antiphasic position, independent of the difference in
broad-band spectral content of inputs 822 and 824 by virtue of
the relative position of controls 804 and 806. Inside the device
802, the signals at inputs 822 and 824 are also routed
directly to outputs 826 and 828, respectively. Thus, a
broad-band spectral difference is realized by changing the value
of potentiometer 804 compared to 806. This results not only in
a conditioning signal, C, being presented at outputs 826 and
828 with equal intensity and opposite polarity, but also
produces a difference in signal intensity at outputs 826 and
828 proportional to the difference represented at inputs 822
and 824 by virtue of the change in the resistances of
potentiometers 804 and 806. Therefore, when potentiometer 804
is counter-rotated compared to potentiometer 806, for example,
several things occur simultaneously.

Assume a monophonic signal is equally applied to 
inputs 814 and 816. Further assume that potentiometers 804
and 806 are adjusted so as to provide an equal signal to input 
822 and input 824. Under such conditions the listener will 
hear a phantom image midway and forward between the two 
loudspeakers. This is the usual monophonic effect. 
Assume a monophonic signal is equally applied to
inputs 814 and 816. Further assume potentiometers 804 and 806
are adjusted so as to provide a larger signal at input 822 and
a smaller signal at input 824. In that event the listener
will hear (i) a louder signal from the left loudspeaker and
a softer signal from the right loudspeaker by virtue of that
part of the circuit that directly routes input signals to the
outputs, and the listener will simultaneously hear (ii) a sound
at the antiphasic position produced by the left and right
loudspeakers by virtue of that part of the circuit that routes
half of the conditioning signal, C, inverted, to the right
loudspeaker. Since this is a synchronistic situation, the
superposition principle, the precedence effect and temporal
fusion all act together so as to blend these two distinctly
produced signals into one homogeneous signal which, upon being
transduced into sound over loudspeakers, will, within the
listening experience, cause the listener to experience a
virtual image to "appear" beyond and to the outside of the
left loudspeaker's physical location.

The extent to which potentiometers 804 and 806 are



WO 94/16538 PCT/US93/12688 


counter rotated will move the virtual image, described above, 
to image along an arc or semicircle extending from the 
physical location of the left loudspeaker to the back of the 
listener's head. 

If potentiometers 804 and 806 of circuit 800 are counter
rotated in the opposite direction so that the right signal is
greater than the left signal at the inputs 822, 824, an
inverse situation from that described above will occur. Thus,
the circuit of Figure 18 can be said to be symmetrical in its
broad-band spectral imbalanced imaging abilities.

To further illustrate the imaging characteristics
of the circuit of Figure 18, assume a monophonic (i.e., monaural)
signal is supplied to inputs 814 and 816 at equal intensity.
Further assume potentiometer 804 is adjusted to be fully OFF
and potentiometer 806 is adjusted to be fully ON. Under such
circumstance, the signal at input 822 will be routed directly
to output 826 and reproduced over the left loudspeaker.
Simultaneously a conditioning signal, C, will be generated by
the circuitry and will be added to the left output signal 826
and inverted at the right output 828. Since no right signal
is present at input 824 no output signal will be directly
routed through to it. Only the inverted conditioning signal
will be present at output 828. Thus the left loudspeaker will
reproduce the left input signal and half of the conditioning
signal while the right loudspeaker will reproduce the other
half of the conditioning signal, which, by definition, is
inverted relative to the left signal. Under this condition,
and depending upon the intensity of the conditioning signal,
as adjusted by 812, the listener will hear the sound imaged
at a point approximately 140 degrees from a center point
(straight forward) of zero degrees.
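
The extreme pan setting just described can be worked through
for a single sample value. The sketch below assumes a linear
pan law and a conditioning signal equal to a scaled
left-minus-right difference with no filtering or delay; the
function name and parameters are hypothetical and serve only
to make the arithmetic concrete.

```python
def outputs_for_pan(sample, position, c_gain):
    """Return (left_out, right_out) for one mono sample.

    position: 0.0 is fully left, 0.5 is centered, 1.0 is fully right
              (an assumed linear law for the counter-ganged pair 804, 806).
    c_gain:   assumed level of the conditioning signal (the role of 812).
    """
    in_left = (1.0 - position) * sample   # signal reaching input 822
    in_right = position * sample          # signal reaching input 824
    c = c_gain * (in_left - in_right)     # conditioning signal
    return in_left + c, in_right - c      # C added left, inverted right
```

At the centered position the two outputs are equal and no
antiphase component is present. At either extreme one output
carries the direct signal plus C and the other carries only
the inverted C, and swapping the extreme swaps the outputs,
which reflects the symmetry noted above.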

Differences Due To Selective Spectral Imbalance 

Assume a monophonic signal of broad-band white
noise is split with equal signals being applied to the inputs
of two external equalizer devices, and the output of each
equalizer is connected to inputs 814 and 816, respectively. In
this case, it is possible to selectively adjust each equalizer so
as to send peaked portions of the sound spectrum to inputs 814
or 816. This arrangement can be said to provide a spectral
difference.

For purposes of explanation, further assume that the
equalizer connected to input 814 is peaked at 1000 Hz and that
the equalizer connected to input 816 is flat or non-peaked.
With the conditioning signal control potentiometer 812 set
at a normal level, and loudspeakers connected to the outputs
818 and 820 via amplifiers, a listener positioned between the
two loudspeakers will hear all broad-band frequencies at a
mid-point between the two loudspeakers and frequencies in the
1000 Hz band image beyond the left loudspeaker at
approximately 100 degrees left from center.

In explanation, input 814 receives the peaked signal
and input 816 the non-peaked input, both originating as
broad-band noise. Since the difference between the two inputs
occurs only in the 1000 Hz range, the side-chain of the
circuit (which side-chain generates the conditioning signal,
C), controlled by potentiometer 812, will contain only this
narrow band of frequencies clustered about the 1000 Hz band.
Output 818 will contain the peaked signal as it is passed
through the circuit by its internal circuitry. Output 820
will contain both the inverted signal from the peaked
equalizer and the non-inverted signal directly from input 816
as routed through the circuit. Note that the peaked frequency
band is made to produce a signal at outputs 818 and 820 of
equal intensity but opposite polarity. Since the intensity
is equal, the image produced by such an arrangement is always
at the antiphasic position. The fact that the experience of
the listener is to hear the peaked frequency band to the left
of the left loudspeaker and not to the back of the head can
be best understood through a study of the phenomenon of
Temporal Fusion.

If the two equalizers used in the above example are
connected in an opposite manner so that the right signal is
peaked and the left signal is flat, the outputs of the
circuits will display an inverse situation. The listener will
experience an equal but opposite listening experience with the
peaked frequency band imaging to beyond the right loudspeaker.
Thus the circuit 800 can be said to be symmetrical with
respect to its differences due to selective spectral
imbalance.
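
The claim that the side chain sees only the peaked band can be
illustrated numerically. In the sketch below the broad-band
noise is replaced by two test tones (500 Hz and 1000 Hz) so
that the result is exact and repeatable; the sample rate, block
length, tone choice and helper names are all assumptions made
for the illustration.

```python
import math

RATE = 8000  # samples per second (assumed)
N = 800      # 0.1 s: a whole number of cycles of both test tones


def tone_mix(amplitudes):
    """Sum of sine tones; `amplitudes` maps frequency in Hz to amplitude."""
    return [sum(a * math.sin(2 * math.pi * f * n / RATE)
                for f, a in amplitudes.items())
            for n in range(N)]


def level_at(signal, freq):
    """Amplitude of the component of `signal` at `freq` (single-bin DFT)."""
    re = sum(s * math.cos(2 * math.pi * freq * n / RATE)
             for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * n / RATE)
             for n, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / len(signal)


left = tone_mix({500: 1.0, 1000: 2.0})   # "peaked" at 1000 Hz
right = tone_mix({500: 1.0, 1000: 1.0})  # flat
difference = [l - r for l, r in zip(left, right)]
```

Measuring `difference` with `level_at` shows energy at 1000 Hz
and essentially none at 500 Hz, so a conditioning signal
derived from it would carry only the peaked band.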

All spectral differences between inputs 814 and 816
behave in similar fashion to the above explanation. Multiple
narrow-band differences, simultaneously formed on either side
of the stereo pair, are amalgamated into a coherent sound
field through the action of the human brain and the excitation
of Temporal Fusion.

Differences Due to Broad-Band Temporal Imbalance 

By introducing a broad-band time delay difference
between inputs 822 and 824, a temporal imbalance then exists.
This imbalance will generate the conditioning signal, C, at
812, but notice that the conditioning signal is presented at
outputs 826 and 828 at the antiphasic position independent of
the temporal displacement at inputs 822 and 824.

Further, the signals at inputs 822 and 824 are also
routed directly to outputs 826 and 828, respectively, as shown
in Figure 15. Thus a broad-band temporal difference realized
by delaying signals presented at input 822 and not at input
824 results not only in a conditioning signal, C, at 812 being
presented at outputs 826 and 828 with equal intensity and
opposite polarity, but it also realizes a difference in time
of the signal at outputs 826 and 828 proportional to the time
difference presented at inputs 822 and 824.
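
That a pure time offset is enough to produce a difference
signal can be seen with a short sketch. The whole-sample delay
with a zero-padded start, and the helper names, are
simplifying assumptions; the source does not specify how the
delay is realized.

```python
def delay(signal, samples):
    """Delay by a whole number of samples, padding the start with silence."""
    return [0.0] * samples + signal[:len(signal) - samples]


def difference_energy(left, right):
    """Energy of the left-minus-right signal that feeds the side chain."""
    return sum((l - r) ** 2 for l, r in zip(left, right))
```

For a monophonic signal applied equally to both inputs,
`difference_energy` is zero; delaying only one of the two
copies makes it non-zero even though both channels carry the
same spectrum at the same level.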

Assume a monophonic signal is equally applied to
inputs 814 and 816. Further assume that potentiometers 804
and 806 are adjusted so as to provide an equal signal to input
822 and input 824. Under such conditions the listener will
hear a phantom image midway and forward between the two
loudspeakers.

Assume a monophonic signal is equally applied to
inputs 814 and 816 but that the signal applied to input 814
is delayed with respect to the signal applied to input 816.
Further assume potentiometers 804 and 806 are equally
adjusted. Under this condition the listener will hear (i) a
louder signal from the right loudspeaker, and a softer signal
from the left loudspeaker by virtue of that part of the
circuit that directly routes input signals to the output, and
the listener will simultaneously hear (ii) a sound at the
antiphasic position produced by the left and right
loudspeakers by virtue of that part of the circuit that routes
half of the conditioning signal, C, inverted, to the right
loudspeaker. Since this is a synchronistic situation, the
superposition principle, temporal fusion, and especially the
precedence effect all act together so as to blend these two
distinctly produced signals into one homogeneous signal which,
upon being transduced into sound over loudspeakers, will, within
the listening experience, cause the listener to experience a
virtual image to "appear" beyond and to the outside of the
left loudspeaker's physical location.

The extent to which an adjustable delay is 
discontinuous (up to about 50 milliseconds) will move the 
virtual image, described above, to image along an arc or
semicircle extending from the physical location of the left 
loudspeaker to the back of the listener's head. 

If input 816 is delayed with respect to input 814, 
an inverse situation from that described above will occur. 
Thus the circuit of Figure 18 can be said to be symmetrical
in its broad-band temporal imbalanced imaging abilities. 

Differences Due to Selective Temporal Imbalance 

Assume a monophonic signal of broad-band white noise 
is split with equal signals applied to the inputs of two 
external equalizer devices, and the output of each equalizer 
connected to a delay device, and the output of each delay 
device being connected to the inputs of a second set of 
equalizers, the outputs of the second set of equalizers being 
connected to inputs 814 and 816. In this arrangement each 
equalizer may be adjusted so as to send peaked portions of the 
sound spectrum to be delayed, and after delay mixed back into 
the original broad-band white noise signal through the second 
equalizer set with a dip at 1000 Hz which is in opposition 
with the first equalizer's peak but adjusted so as to produce 
an equal intensity across the sound spectrum except with a 
portion of the spectrum at 1000 Hz delayed with respect to the
other portions of the spectrum. This arrangement can be said
to provide a selective temporal imbalance.

For purposes of explanation let us further assume
that the equalizer-delay-equalizer arrangement described above
is connected to inputs 814 and 816. Let us further assume
that the equalizer-delay-equalizer connected to input 814 is
peaked at 1000 Hz and that the equalizer-delay-equalizer
connected to input 816 is flat or non-peaked. With the
conditioning signal control potentiometer 812 set at a normal
level, and loudspeakers connected to the outputs 818 and 820
(via amplifiers), a listener positioned between the two
loudspeakers will hear all broad-band frequencies in the 1000
Hz band image beyond the right loudspeaker at approximately
100 degrees right from center.

In explanation, input 814 receives broad-band white
noise with the 1000 Hz band delayed and input 816 the
non-delayed input of broad-band white noise. Since the time
difference between the two inputs occurs only in the 1000 Hz
range, the side-chain of 802, controlled by potentiometer 812,
will contain only this narrow band of frequencies clustered
about the 1000 Hz band. Output 818 will contain the delayed
signal as it is passed through the circuit. Output 820 will
contain both the inverted signal from the delayed equalizer
and the non-inverted signal from input 816 as routed through
the circuit. Note that the delayed frequency band is made to
produce a signal at outputs 818 and 820 of equal intensity but
of opposite polarity. Since the intensity is equal, the image
produced by such an arrangement is always at the antiphasic
position. The fact that the experience of the listener is to
hear the delayed frequency band to the right of the right
loudspeaker and not to the back of the head can be best
understood through a study of the estimable phenomenon of
Temporal Fusion.

If the two equalizer-delay-equalizer arrangements
used in the above example are connected in an opposite manner
so that the right signal is peak-delayed and the left signal
is flat, the outputs will display an inverse situation. The
listener will experience an equal but opposite listening
experience with the peaked and delayed frequency band imaging
to beyond the left loudspeaker. Thus the circuit can be said
to be symmetrical with respect to its differences due to
selective temporal imbalance.

All temporal differences of 50 milliseconds or less
between inputs 814 and 816 behave in similar fashion to the
above explanation. Multiple narrow-band temporal differences,
simultaneously formed on either side of the stereo pair, are
amalgamated into a coherent sound field through the action of
the human brain and the excitation of Temporal Fusion.

Ninth Embodiment 

Multiple Inputs in Professional Recording 
The manipulation system and apparatus is not limited
to one panoramic potentiometer (panpot) control as shown in
Figure 18 at numbers 804 and 806. A useful variation of the
Eighth Embodiment is shown in Figure 19, and it comprises a
ninth embodiment of the invention. This embodiment involves
the use of multiple inputs. Buses 830 and 832 are extensions
from inputs 822 and 824 of the circuit of Figures 15 and 18
and said buses accommodate a plurality of panpots. In Figure
19, inputs 814A, 814B, 814C, 814D and 814E are all connected
to the left side of the respective panpots shown which in turn
are connected to bus 830. Inputs 816A, 816B, 816C, 816D and
816E are all connected to the right side of respective panpots
shown which in turn are connected to bus 832.
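
The bus arrangement can be pictured as a sum of panned sources
feeding the two inputs of the circuit. The sketch below is
illustrative only: the linear pan law, the function name and
the data layout are assumptions, and real panpots and summing
buses will differ in their gain laws.

```python
def mix_to_buses(sources):
    """Sum panned mono sources onto a left bus and a right bus.

    sources: list of (samples, position) pairs, where position runs from
             0.0 (fully left) to 1.0 (fully right). All sample lists are
             assumed to have the same length.
    Returns (bus_left, bus_right), standing in for buses 830 and 832.
    """
    length = len(sources[0][0])
    bus_left = [0.0] * length
    bus_right = [0.0] * length
    for samples, position in sources:
        for n, s in enumerate(samples):
            bus_left[n] += (1.0 - position) * s
            bus_right[n] += position * s
    return bus_left, bus_right
```

Each source contributes to the buses according to its own pan
position, so the difference between the two buses, and hence
the conditioning signal, reflects every panpot at once.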

Signals inputted to any inputs 'A' through 'E' will
exhibit an influence on the manipulation system and apparatus
in a manner identical to signals inputted to inputs 814 and
816 as previously disclosed. Thus, connecting the
manipulation system and apparatus to a modern recording
console in such a way so as to direct the signals from the
console's combined panning buses to the inputs 814 and 816
will, in effect, extend the use of the circuit of Figures 15
and 18 to every panpot on the console. In such an arrangement
the output of the circuit can be redirected back into the
console's recording buses or to two isolated inputs that are
in turn recombined with other isolated inputs to be combined
into the line output of the console for purposes of recording
on an analog or digital, magnetic or optical recorder.



Monophonic Applications in Professional Recording 






In professional recording situations, the engineer
has control over the selective and broad-band spectral and/or
temporal content of the various instrumental elements
comprising a musical production. By varying the selective or
broad-band spectral and/or temporal content of specific
elements, in the manner described above using the manipulation
system and apparatus presented, the engineer can exhibit a
keen degree of control over the image position of those
elements; an image position which includes as its field of
control that portion of the sound field which extends beyond
the physical location of the two loudspeakers and to a point
of at least 140 degrees from center in both directions.
Furthermore the engineer can move any element in seamless
progression along an arc extending from mid-point between the
two loudspeakers to a point of at least 140 degrees from the
center of the two loudspeakers to either the left or the right
of the two stereo loudspeakers.

Any number of monophonic signals may be 
simultaneously inputted into any number of inputs under the 
above arrangement, and each signal will be treated by the
manipulation system and apparatus as if independent, that is 
to say, the manipulation of one signal will not influence the 
treatment of any other inputted signals. 

Stereophonic Applications in Professional Recording 
Within the above arrangement, but not limited to it,
if a stereo signal is inputted into any two inputs of a
console such as 814A and 816C and the panner of 814A is
rotated to the left (counter-clockwise) and the panner of
816C is rotated to the right (clockwise), the action of the
circuit of Figures 15 and 18 will be the same as if the stereo
signal had been inputted to inputs 822 and 824 directly, as
described in Embodiment Eight (or Embodiment Six for that
matter).

Any number of stereo pairs may be simultaneously
inputted into any two inputs under the above arrangement, and
each stereo signal will be treated by the manipulation system
and apparatus as if independent.

Further, any number of stereo pairs and mono inputs
may be simultaneously inputted into any combination of inputs
under the above arrangement, and each input, be it mono or
stereo, will be treated by the manipulation system and
apparatus as if independent.

By connecting the manipulation system and apparatus
to a modern recording console and operating this invention in
conjunction with the operation of a modern recording console,
it can be said that the production advantages afforded by this
invention can be retro-fitted to any console of common design
through the use and application of these circuits.

This invention has been described with reference to
a number of embodiments, and it is clear that it is
susceptible to numerous modifications, modes and embodiments
within the ability of those skilled in the art and without the
exercise of the inventive faculty. By way of example, the
conditioning signal in all embodiments is shown as being added
to the left channel and subtracted from the right channel.
This convention may be reversed, if desired, although it is
believed that the electronics industry will follow the
convention described herein for consistency. Other
modifications are well within the skill of those skilled in
this art. Accordingly, this invention is not limited to the
disclosed embodiments, except as required by the appended
claims.

WHAT IS CLAIMED IS

1. An automatic stereophonic image enhancement
apparatus comprising:

first and second lines each having an input and an
output;

a first circuit in said first line and a second
circuit in said second line respectively between said
input and said output;

input connection means connected to said first and
second lines between its input and its respective circuit for
receiving a signal;

frequency-dependent delay means connected to said
input connection means for delaying the signal at said input
connection means to produce a delayed signal; and

control means for receiving said delayed signal,
said control means having an output coupled to both of said
devices in said first and second lines for delivery of the
delayed signal thereto, said control means controlling the
amplitude of the delayed signal so as to produce a delayed and
amplitude controlled compensation signal to said circuits.

2. The apparatus of claim 1 wherein said connection 
means comprises means for subtracting the signals on said 
first and second lines from each other.

3. The apparatus of claim 2 wherein the means for
subtracting provides a signal which is equal to the difference
between signals applied at said inputs so that the output
signals when added together form a monaural signal, said
monaural signal having no artifacts present as a result of
said apparatus.

4. The apparatus of claim 2 wherein one of said
circuits is arranged as an adder and the other of said
circuits is arranged as a subtractor.
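
The monaural property recited in claims 3 and 4 can be checked numerically. The following is a minimal sketch, not the circuit of the disclosure: the sample values and the 0.5 scale factor are arbitrary illustrations, and the delay is omitted because the cancellation holds for any conditioning signal that is added to one channel and subtracted from the other.

```python
def enhance(left, right, conditioning):
    """Add the conditioning signal to the left channel, subtract it from the right."""
    out_left = [l + c for l, c in zip(left, conditioning)]
    out_right = [r - c for r, c in zip(right, conditioning)]
    return out_left, out_right


# Illustrative sample values only (hypothetical input data).
left = [0.5, -0.25, 0.75, 0.0]
right = [0.25, 0.5, -0.5, 1.0]

# Difference of the two inputs (claim 2), scaled by an arbitrary gain of 0.5.
conditioning = [0.5 * (l - r) for l, r in zip(left, right)]

out_left, out_right = enhance(left, right, conditioning)

# Summing the outputs cancels the conditioning signal, leaving the original mono sum.
mono_in = [l + r for l, r in zip(left, right)]
mono_out = [l + r for l, r in zip(out_left, out_right)]
```

Because the conditioning term enters the two outputs with opposite signs, `mono_out` matches `mono_in` sample for sample, while each individual output differs from its input.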

5. The apparatus of claim 4 further including output
connection means connected to said first and second lines
between said circuits and said outputs, said output connection
means sensing the signal in said first and second lines at
said outputs and being connected to said control means for
automatically adjusting said control means to maintain the
compensation signal in said output lines substantially at a
desired level.

6. The apparatus of claim 5 wherein said output
connection means includes a differencing device and a summing
device both connected to said first and second lines adjacent
the outputs thereof to produce difference and sum signals, a
signal envelope detector connected to each of said differencing
and summing devices, a comparator connected to both of said
detectors, said comparator having an output connected to said
control means so that the compensation signal to said control
means is controlled by said comparator for automatic
adjustment of said compensation signal as a function of the
sum and difference of the signal in said first and second
lines adjacent the output thereof.
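
The feedback arrangement of claim 6 can be sketched in software. This is an illustration under stated assumptions, not the disclosed circuit: the peak-hold envelope follower, the `decay`, `target_ratio` and `step` values, and the simple up/down gain rule are all hypothetical choices that the claim does not specify.

```python
def envelope(signal, decay=0.99):
    """Peak-hold envelope follower: rectify, then let the level decay exponentially."""
    level = 0.0
    out = []
    for x in signal:
        level = max(abs(x), level * decay)
        out.append(level)
    return out


def comparator_gain_step(out_left, out_right, gain, target_ratio=0.5, step=0.01):
    """Nudge the compensation gain up or down by comparing two envelopes.

    target_ratio and step are hypothetical tuning values.
    """
    diff = [l - r for l, r in zip(out_left, out_right)]    # differencing device
    total = [l + r for l, r in zip(out_left, out_right)]   # summing device
    diff_env = envelope(diff)[-1]
    sum_env = envelope(total)[-1]
    # Comparator: too much difference relative to sum lowers the gain.
    if diff_env > target_ratio * sum_env:
        return max(0.0, gain - step)
    return gain + step
```

Calling `comparator_gain_step` once per block of output samples raises the gain when the outputs are nearly identical and lowers it when the difference envelope dominates, which is one way to hold the compensation signal near a desired level.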

7. The apparatus of claim 6 further
including a manually controllable device and a switch, said
switch being operable to selectively connect said comparator
and said manually controllable device to said controller so
that the amplitude of said compensation signal can be
selectively automatically and manually achieved.

8. The apparatus of claim 5 wherein the output of
said control means is connected through equalizers to said 
devices. 

9. The apparatus of claim 2 wherein a gate is
connected in input-gating relationship to said control means,
and comparator means is connected between said second-named
connection means and said gate, said comparator means
including input devices establishing a threshold ratio between
first and second signal inputs to said comparator means, said
comparator means sensing the presence of a monophonic signal
for closing said gate for deactivating said control means
during the presence of such monophonic signal.
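
The monophonic-signal gate of claim 9 can be illustrated as a ratio test. This is a sketch under assumptions: the claim does not say here which two signals feed the comparator, so the difference and sum of the channels are assumed, and the `threshold_ratio` value of 0.05 is hypothetical.

```python
def gate_open(left, right, threshold_ratio=0.05):
    """Return False (gate closed) when the block of samples looks monophonic.

    Assumes the comparator's two inputs are the channel difference and the
    channel sum; threshold_ratio is a hypothetical setting.
    """
    diff_peak = max(abs(l - r) for l, r in zip(left, right))
    sum_peak = max(abs(l + r) for l, r in zip(left, right))
    return diff_peak > threshold_ratio * sum_peak


def gated_gain(left, right, gain):
    """Pass the gain through only while the gate is open."""
    return gain if gate_open(left, right) else 0.0
```

With identical left and right samples the difference peak is zero, so the gate closes and the gain applied to the conditioning signal drops to zero.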



10. The apparatus of claim 1 wherein there are four
input connections for connection to a quad bus, said four
inputs being connected respectively to four input amplifiers,
each of said input amplifiers having an output line, the
output line of the first of said amplifiers being said first
line and the output line of the second of said amplifiers
being said second line and the output lines of the third and
fourth of said amplifiers being respectively connected to said
first and second circuits so that the quad bus signals
connected to said first and second amplifiers and in said
first and second lines are subject to delay, phase shift and
addition, while the quad buses connected to the third and
fourth amplifiers contribute unprocessed signals.

11. The apparatus of claim 1 wherein there are three
input amplifiers for connection to three separate
acoustically-related signal sources, the first of said
amplifiers having said first line as its output, the second
of said amplifiers having said second line as its output, and
the third of said amplifiers having an output line connected
additively to the input of one of said circuits and
subtractively to the input of the other one of said circuits.

12. The apparatus of claim 1 wherein said control
means is a voltage controlled amplifier.

13. The apparatus of claim 1 wherein said frequency-
dependent delay means comprises a band pass filter having a
center frequency peaking in the range of 300 to 3000 Hz.

14. The apparatus of claim 13 wherein said frequency
peaks at 500 Hz.
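
A band pass filter peaking at 500 Hz, as in claims 13 and 14, can be approximated digitally. The sketch below uses a standard second-order (biquad) band-pass design with unity peak gain; the biquad form, the Q of 1.0 and the 48 kHz sample rate are assumptions made for illustration and are not taken from the disclosure.

```python
import cmath
import math


def bandpass_coefficients(center_hz, q, sample_rate):
    """Second-order band-pass coefficients with unity gain at the center frequency."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def magnitude(b, a, freq_hz, sample_rate):
    """Magnitude of the filter's frequency response at freq_hz."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)
    num = b[0] + b[1] * z_inv + b[2] * z_inv * z_inv
    den = a[0] + a[1] * z_inv + a[2] * z_inv * z_inv
    return abs(num / den)


# Hypothetical settings: 500 Hz center, Q of 1.0, 48 kHz sample rate.
b, a = bandpass_coefficients(500.0, 1.0, 48000.0)
peak = magnitude(b, a, 500.0, 48000.0)
low = magnitude(b, a, 50.0, 48000.0)
high = magnitude(b, a, 5000.0, 48000.0)
```

The response is at its maximum at the center frequency and falls away on either side. The phase of such a filter varies with frequency, so its output is delayed by a frequency-dependent amount relative to its input.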

15. The apparatus of claim 1 wherein said delay
means for producing a delayed signal is a digital delay means
and digital filter means.

16. An audio image enhancement apparatus comprising:
(a) first and second audio inputs which are coupled
to first and second audio microphones located relative to a
sound source;

(b) first and second enhanced audio outputs;

(c) a source of time-delayed audio signals, which
arrive later than corresponding signals on said first and
second audio inputs, said source comprising a third microphone
located further from said source than either said first or
second microphone;

(d) a variable gain circuit in circuit coupled to
an output of said source;

(e) an adder having a first input coupled with said
variable gain circuit, a second input coupled to receive said
first signal and having an output coupled with said first
enhanced audio output; and

(f) a subtractor having a first input coupled with
said variable gain circuit, a second input coupled to receive
said second signal and having an output coupled with said
second enhanced audio output.

17. The audio image enhancement apparatus of claim
16 wherein said first, second and third microphones are
positioned at the apexes of a triangle which has a base
confronting said sound source and wherein said third
microphone is located at the apex opposite said base and
furthest from said sound source.



18. An audio image enhancement apparatus comprising:

(a) first and second audio inputs;

(b) first and second enhanced audio outputs;

(c) a source of audio signals, said source including
a subtractor circuit for subtracting a first signal
communicated via said first input from a second signal
communicated via said second input, and a filter coupled to said
subtractor circuit, said filter comprising a narrow band pass
filter having a center frequency in the range of 300-3000 Hz;

(d) an adder having a first input coupled with said
source, a second input coupled to receive said first signal
and having an output coupled with said first enhanced audio
output; and

(e) a subtractor having a first input coupled with
said source, a second input coupled to receive said second
signal and having an output coupled with said second enhanced
audio output.

19. The audio image enhancement apparatus of claim
18 wherein said source includes a time delay circuit.

20. The audio image enhancement apparatus of claim
19 wherein said audio source includes a variable gain circuit.

21. The audio image enhancement apparatus of claim
20 further comprising a switch for selectively coupling the
output of the subtractor circuit directly to said variable
gain circuit or via said filter and time delay circuit.

22. The audio image enhancement apparatus of claim
20 further comprising a feedback circuit coupled to the
outputs of said adder and said subtractor and to said variable
gain circuit for adjusting the gain of said variable gain
circuit in response to detected levels of spatial information
at the enhanced audio outputs.

23. An audio image enhancement apparatus comprising:

(a) a plurality of audio inputs;

(b) first and second enhanced audio outputs;

(c) a bus having a plurality of audio lines;

(d) a plurality of joysticks, each joystick
being associated with an audio input, each audio
input having an input circuit for steering
different amounts of the signal at said audio input
onto the lines of said bus based upon the position
of said joystick;

(e) a sound processing circuit coupled to at
least a pair of lines of said bus; and

(f) output amplifiers having outputs coupled
to said enhanced audio outputs, each output
amplifier having inputs for summing signals on
selected ones of said audio lines in said bus and
also for summing signals output from said sound
processing circuit.

24. The audio image enhancement apparatus of claim 
wherein said sound processing circuit comprises: 

(i) a subtractor circuit for subtracting 
a first signal communicated via a first one of said 
pair of lines from a second signal communicated via 
a second one of said pair of lines; 

(ii) a filter for frequency filtering 
output of said subtractor; 

(iii) an invertor for inverting the output 
of said filter; 

and wherein outputs of said filter and said 
invertor are coupled to respective ones of said
output amplifiers. 

25. The apparatus of claim 24 wherein said filter 
has a peak in the range of 300-3000 Hz. 

26. The apparatus of claim 25 wherein said peak is
at 500 Hz.

27. The audio enhancement apparatus of claim 23,
further including a mono-compatibility bus in said plurality
of audio lines, said mono-compatibility bus being coupled to
said output amplifiers via an equalization filter, said
equalization filter being a low and high pass filter.

28. The audio enhancement apparatus of claim 27
wherein said equalization filter passes audio information in
frequency bands outside a frequency band passed by said sound
processing circuit.

29. A method of enhancing audio information
comprising the steps of:

(a) subtracting a pair of audio signals from
each other;

(b) frequency dependent time-delaying the
results of said subtracting step;

(c) adding the results of said time-delaying
step to one of said pair of audio signals; and

(d) subtracting the results of said
time-delaying step from the other of said pair of
audio signals.
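
Steps (a) through (d) of claim 29 can be sketched end to end. This is a minimal illustration under assumptions: a first-order all-pass filter is used here only as a simple stand-in for a frequency-dependent delay (claim 13 recites a band pass filter for the apparatus), and the coefficient `c` and the `gain` scale factor are hypothetical values.

```python
def allpass(signal, c=0.5):
    """First-order all-pass filter: unity magnitude, frequency-dependent phase delay."""
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in signal:
        y = -c * x + prev_x + c * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


def enhance_method(left, right, gain=0.5):
    """Apply steps (a)-(d); gain is a hypothetical scale on the delayed result."""
    diff = [l - r for l, r in zip(left, right)]                 # step (a)
    delayed = allpass(diff)                                     # step (b)
    conditioning = [gain * d for d in delayed]
    out_left = [l + c for l, c in zip(left, conditioning)]      # step (c)
    out_right = [r - c for r, c in zip(right, conditioning)]    # step (d)
    return out_left, out_right


# Illustrative input: an impulse on the left channel only.
out_left, out_right = enhance_method([1.0, 0.0, 0.0], [0.0, 0.0, 0.0])
```

With identical left and right inputs the difference in step (a) is zero, so both outputs equal their inputs. With differing inputs the delayed difference appears with opposite signs in the two outputs.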

30. The method of claim 29, further including the
step of controlling the amplitude of the results of the
time-delaying step before those results are added to or
subtracted from said audio signals.

31. The method of claim 30 further including the
step of detecting the spatial content of said audio signals
and controlling the amplitude of the results of the
time-delaying step based on the results of the detecting step in
order to keep the spatial content of said audio signals within
certain predetermined limits.